Re: [ovs-dev] [PATCH v2 01/26] ovn-northd-ddlog: Fix two memory leaks.

2021-04-01 Thread 0-day Robot
Bleep bloop.  Greetings Ben Pfaff, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


git-am:
error: sha1 information is lacking or useless (northd/ovn-northd-ddlog.c).
error: could not build fake ancestor
hint: Use 'git am --show-current-patch' to see the failed patch
Patch failed at 0001 ovn-northd-ddlog: Fix two memory leaks.
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2 25/26] ovn-northd-ddlog: Remove Router.static_routes.

2021-04-01 Thread Ben Pfaff
From: Leonid Ryzhyk 

This is another instance of a performance bug when a change to a
router object forces a cascade of changes to relations that reference
the object.  This time around the problem was caused by the
`Router.static_routes` field, which is copied from
`nb::Logical_Router`.  Luckily, this field was only used in one rule
and was easy to remove.

Here is how we diagnosed the issue (this may be useful for posterity):

- We started with a benchmark that executed several hundred similar
  transactions (in this case, these transactions were adding new router
  ports).  We recorded execution of the benchmark in a DDlog command
  file (replay.txt) and added `timestamp;` commands after each
  transaction in the file.

- Run `make NORTHD_CLI=1` to generate the ovn_northd_cli executable and
  use it to execute the command file:
  ```
  ./ovn_northd_ddlog/target/release/ovn_northd_cli -w 1 < replay.txt > replay.dump
  ```

- Extract only the timestamps from replay.dump and plot differences
  between successive timestamps (i.e., individual transaction times).
  I use gnumeric.  It would be nice to have some automation for this
  in the future.  We observe that one of the transactions in the
  benchmark loop slows down linearly as the size of the network
  topology grows:
  https://gist.github.com/ryzhyk/16a5607b280ed9cd09b176d6816cb4f0
  Clearly some of the rules in the program are getting more expensive
  as the number of ports goes up.  Another interesting observation is
  that the size of the delta output at each iteration of the benchmark
  remains constant (the delta mostly consists of new network flows).
  This suggests that whatever extra work DDlog is doing at each
  iteration might be redundant.

- To identify where the wasted work happens, we re-compile the program
  passing the `--output-internal-relations` flag to DDlog, which tells it
  to dump changes to all intermediate relations, not just output
  relations.  We replay the trace again.  We locate the expensive
  transaction in the log and compare its output from one of the first
  iterations vs one from the end of the log.  We now see a whole bunch of
  intermediate relations that only had a few modified records in the
  first transaction versus hundreds in the second one.  We further
  observe that all of these changes simply update the `static_routes`
  field (as explained above).

- After removing the field, we observe that changes to intermediate
  relations no longer grow with the topology, and transaction timing
  increases much more slowly:
  https://gist.github.com/ryzhyk/d02784b9088d82f8549ea1b2ebdf095e
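
The timestamp post-processing described above could be automated along
these lines.  This is only a sketch: it assumes that each `timestamp;`
command shows up in replay.dump as a line of the form
`Timestamp: <integer>`, which should be checked against the actual dump,
and the function names are made up for illustration.

```python
import re
from typing import Iterable, List

# Assumed (unverified) format of the line printed for each `timestamp;`
# command in the dump; adjust the pattern if the real output differs.
TIMESTAMP_RE = re.compile(r"^Timestamp:\s*(\d+)\s*$")


def extract_timestamps(lines: Iterable[str]) -> List[int]:
    """Return the timestamp values found in the dump, in file order."""
    stamps = []
    for line in lines:
        match = TIMESTAMP_RE.match(line)
        if match:
            stamps.append(int(match.group(1)))
    return stamps


def transaction_times(stamps: List[int]) -> List[int]:
    """Differences between successive timestamps, one per transaction."""
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


def report(path: str) -> None:
    """Print "index<TAB>duration" rows that a plotting tool can read."""
    with open(path) as dump:
        stamps = extract_timestamps(dump)
    for index, duration in enumerate(transaction_times(stamps)):
        print(f"{index}\t{duration}")
```

Calling `report("replay.dump")` prints one row per transaction; a
steadily growing second column is the linear slowdown described above.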

Signed-off-by: Leonid Ryzhyk 
Signed-off-by: Ben Pfaff 
---
 northd/lrouter.dl | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/northd/lrouter.dl b/northd/lrouter.dl
index 38a19e7ea324..7ed5358c9431 100644
--- a/northd/lrouter.dl
+++ b/northd/lrouter.dl
@@ -434,7 +434,6 @@ typedef Router = Router {
 /* Fields copied from nb::Logical_Router. */
 _uuid:  uuid,
 name:   string,
-static_routes:  Set,
 policies:   Set,
 enabled:Option,
 nat:Set,
@@ -459,7 +458,6 @@ relation Router[Intern]
 Router[Router{
 ._uuid =lr._uuid,
 .name  =lr.name,
-.static_routes =lr.static_routes,
 .policies  =lr.policies,
 .enabled   =lr.enabled,
 .nat   =lr.nat,
@@ -730,7 +728,8 @@ RouterStaticRoute_(.router = router,
.nexthop = route.nexthop,
.output_port = route.output_port,
.ecmp_symmetric_reply = route.ecmp_symmetric_reply) :-
-router in &Router(.static_routes = routes),
+router in &Router(),
+nb::Logical_Router(._uuid = router._uuid, .static_routes = routes),
 var route_id = FlatMap(routes),
 route in &StaticRoute(.lrsr = nb::Logical_Router_Static_Route{._uuid = route_id}).
 
-- 
2.29.2



[ovs-dev] [PATCH v2 19/26] ovn-northd-ddlog: Remove unused function.

2021-04-01 Thread Ben Pfaff
From: Leonid Ryzhyk 

Signed-off-by: Leonid Ryzhyk 
Signed-off-by: Ben Pfaff 
---
 northd/lswitch.dl | 1 -
 1 file changed, 1 deletion(-)

diff --git a/northd/lswitch.dl b/northd/lswitch.dl
index aff265bbf82c..973faec5073f 100644
--- a/northd/lswitch.dl
+++ b/northd/lswitch.dl
@@ -23,7 +23,6 @@ import ipam
 import vec
 
 function is_enabled(lsp: nb::Logical_Switch_Port): bool { is_enabled(lsp.enabled) }
-function is_enabled(lsp: Ref<nb::Logical_Switch_Port>): bool { lsp.deref().is_enabled() }
 function is_enabled(sp: SwitchPort): bool { sp.lsp.is_enabled() }
 function is_enabled(sp: Intern<SwitchPort>): bool { sp.lsp.is_enabled() }
 
-- 
2.29.2



[ovs-dev] [PATCH v2 22/26] ovn-northd-ddlog: Intern selected input relations.

2021-04-01 Thread Ben Pfaff
From: Leonid Ryzhyk 

DDlog 0.38.0 adds the `--intern-table` CLI flag to the `ovsdb2ddlog`
compiler to declare input tables coming from OVSDB as `Intern<...>`.
This is useful for tables whose records are copied around as a whole and
can therefore benefit from interning performance- and memory-wise.  In
the past we had to create separate tables in `helpers.dl` and copy
records from the original input table to them while wrapping them in
`Intern<>`.  With this change, we avoid the extra copy and intern
records as we ingest them for selected tables.

We use the `--intern-table` flag to eliminate all intermediate tables in
`helpers.dl`.
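
As a rough illustration of why interning helps, here is a Python sketch
of the general idea.  It is not DDlog's implementation, and `InternPool`
is a hypothetical name: equal records are stored once, so passing a
record around copies only a reference, and two interned records can be
compared by identity instead of field by field.

```python
from typing import Dict, Hashable


class InternPool:
    """Keeps one canonical object per distinct (hashable) value."""

    def __init__(self) -> None:
        self._pool: Dict[Hashable, Hashable] = {}

    def intern(self, value: Hashable) -> Hashable:
        # Return the canonical object for this value, registering the
        # value as canonical the first time it is seen.
        return self._pool.setdefault(value, value)


pool = InternPool()

# Two equal records built independently are distinct objects...
record_a = tuple(["lsp-uuid-1", "router", "router-port=lrp1"])
record_b = tuple(["lsp-uuid-1", "router", "router-port=lrp1"])

# ...but interning maps both to the same canonical object.
interned_a = pool.intern(record_a)
interned_b = pool.intern(record_b)
```

After interning, `interned_a is interned_b` holds, so an equality check
is a pointer comparison regardless of how large the record is.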

Signed-off-by: Leonid Ryzhyk 
Signed-off-by: Ben Pfaff 
---
 northd/helpers.dl| 36 
 northd/lrouter.dl| 12 ++--
 northd/lswitch.dl| 26 +-
 northd/ovn-nb.dlopts |  8 
 northd/ovn-sb.dlopts |  1 +
 northd/ovn_northd.dl | 30 +++---
 northd/ovsdb2ddlog2c |  4 +++-
 7 files changed, 46 insertions(+), 71 deletions(-)

diff --git a/northd/helpers.dl b/northd/helpers.dl
index 49281fcafc9a..33a8d15d8b32 100644
--- a/northd/helpers.dl
+++ b/northd/helpers.dl
@@ -20,42 +20,6 @@ import ovn
 
 output relation Warning[string]
 
-/* ACLRef: reference to nb::ACL */
-relation ACLRef[Intern<nb::ACL>]
-ACLRef[acl.intern()] :- nb::ACL[acl].
-
-/* DHCP_Options: reference to nb::DHCP_Options */
-relation DHCP_OptionsRef[Intern<nb::DHCP_Options>]
-DHCP_OptionsRef[options.intern()] :- nb::DHCP_Options[options].
-
-/* QoS: reference to nb::QoS */
-relation QoSRef[Intern<nb::QoS>]
-QoSRef[qos.intern()] :- nb::QoS[qos].
-
-/* LoadBalancerRef: reference to nb::Load_Balancer */
-relation LoadBalancerRef[Intern<nb::Load_Balancer>]
-LoadBalancerRef[lb.intern()] :- nb::Load_Balancer[lb].
-
-/* LoadBalancerHealthCheckRef: reference to nb::Load_Balancer_Health_Check */
-relation LoadBalancerHealthCheckRef[Intern<nb::Load_Balancer_Health_Check>]
-LoadBalancerHealthCheckRef[lbhc.intern()] :- nb::Load_Balancer_Health_Check[lbhc].
-
-/* MeterRef: reference to nb::Meter*/
-relation MeterRef[Intern<nb::Meter>]
-MeterRef[meter.intern()] :- nb::Meter[meter].
-
-/* NATRef: reference to nb::NAT*/
-relation NATRef[Intern<nb::NAT>]
-NATRef[nat.intern()] :- nb::NAT[nat].
-
-/* AddressSetRef: reference to nb::Address_Set */
-relation AddressSetRef[Intern<nb::Address_Set>]
-AddressSetRef[__as.intern()] :- nb::Address_Set[__as].
-
-/* ServiceMonitor: reference to sb::Service_Monitor */
-relation ServiceMonitorRef[Intern<sb::Service_Monitor>]
-ServiceMonitorRef[sm.intern()] :- sb::Service_Monitor[sm].
-
 /* Switch-to-router logical port connections */
 relation SwitchRouterPeer(lsp: uuid, lsp_name: string, lrp: uuid)
 SwitchRouterPeer(lsp, lsp_name, lrp) :-
diff --git a/northd/lrouter.dl b/northd/lrouter.dl
index 23d320be6cc7..c51f0fbe6c44 100644
--- a/northd/lrouter.dl
+++ b/northd/lrouter.dl
@@ -281,7 +281,7 @@ relation LogicalRouterNAT0(
 LogicalRouterNAT0(lr, nat, external_ip, external_mac) :-
 nb::Logical_Router(._uuid = lr, .nat = nats),
 var nat_uuid = FlatMap(nats),
-nat in &NATRef[nb::NAT{._uuid = nat_uuid}],
+nat in &nb::NAT(._uuid = nat_uuid),
 Some{var external_ip} = ip46_parse(nat.external_ip),
 var external_mac = match (nat.external_mac) {
 Some{s} -> eth_addr_from_string(s),
@@ -290,12 +290,12 @@ LogicalRouterNAT0(lr, nat, external_ip, external_mac) :-
 Warning["Bad ip address ${nat.external_ip} in nat configuration for router ${lr_name}."] :-
 nb::Logical_Router(._uuid = lr, .nat = nats, .name = lr_name),
 var nat_uuid = FlatMap(nats),
-nat in &NATRef[nb::NAT{._uuid = nat_uuid}],
+nat in &nb::NAT(._uuid = nat_uuid),
 None = ip46_parse(nat.external_ip).
 Warning["Bad MAC address ${s} in nat configuration for router ${lr_name}."] :-
 nb::Logical_Router(._uuid = lr, .nat = nats, .name = lr_name),
 var nat_uuid = FlatMap(nats),
-nat in &NATRef[nb::NAT{._uuid = nat_uuid}],
+nat in &nb::NAT(._uuid = nat_uuid),
 Some{var s} = nat.external_mac,
 None = eth_addr_from_string(s).
 
@@ -308,12 +308,12 @@ LogicalRouterNAT(lr, NAT{nat, external_ip, external_mac, Some{AllowedExtIps{__as
 LogicalRouterNAT0(lr, nat, external_ip, external_mac),
 nat.exempted_ext_ips == None,
 Some{var __as_uuid} = nat.allowed_ext_ips,
-__as in &AddressSetRef[nb::Address_Set{._uuid = __as_uuid}].
+__as in &nb::Address_Set(._uuid = __as_uuid).
 LogicalRouterNAT(lr, NAT{nat, external_ip, external_mac, Some{ExemptedExtIps{__as}}}) :-
 LogicalRouterNAT0(lr, nat, external_ip, external_mac),
 nat.allowed_ext_ips == None,
 Some{var __as_uuid} = nat.exempted_ext_ips,
-__as in &AddressSetRef[nb::Address_Set{._uuid = __as_uuid}].
+__as in &nb::Address_Set(._uuid = __as_uuid).
 Warning["NAT rule: ${nat._uuid} not applied, since"
 "both allowed and exempt external ips set"] :-
 LogicalRouterNAT0(lr, nat, _, _),
@@ -404,7 +404,7 @@ relation LogicalRouterLB(lr: uuid, nat: 
Intern)
 LogicalRouterLB(lr, lb) :-
 nb::Logical_Router(._uuid = lr, .load_balancer = l

[ovs-dev] [PATCH v2 24/26] ovn-northd-ddlog: Intern nb::Logical_Switch_Port.

2021-04-01 Thread Ben Pfaff
From: Leonid Ryzhyk 

Use the `--intern-table` switch to intern `Logical_Switch_Port` records,
so that they can be copied and compared efficiently by pointer.

Signed-off-by: Leonid Ryzhyk 
Signed-off-by: Ben Pfaff 
---
 northd/helpers.dl|  2 +-
 northd/lrouter.dl|  2 +-
 northd/lswitch.dl| 28 ++--
 northd/multicast.dl  |  6 +++---
 northd/ovn-nb.dlopts |  1 +
 northd/ovn_northd.dl | 40 
 6 files changed, 40 insertions(+), 39 deletions(-)

diff --git a/northd/helpers.dl b/northd/helpers.dl
index 820e37bb3072..757532e46c0e 100644
--- a/northd/helpers.dl
+++ b/northd/helpers.dl
@@ -23,7 +23,7 @@ output relation Warning[string]
 /* Switch-to-router logical port connections */
 relation SwitchRouterPeer(lsp: uuid, lsp_name: string, lrp: uuid)
 SwitchRouterPeer(lsp, lsp_name, lrp) :-
-nb::Logical_Switch_Port(._uuid = lsp, .name = lsp_name, .__type = "router", .options = options),
+&nb::Logical_Switch_Port(._uuid = lsp, .name = lsp_name, .__type = "router", .options = options),
 Some{var router_port} = options.get("router-port"),
 &nb::Logical_Router_Port(.name = router_port, ._uuid = lrp).
 
diff --git a/northd/lrouter.dl b/northd/lrouter.dl
index 81e4a03e8a91..38a19e7ea324 100644
--- a/northd/lrouter.dl
+++ b/northd/lrouter.dl
@@ -92,7 +92,7 @@ FirstHopLogicalRouter(lrouter, lswitch) :-
 relation LogicalSwitchRouterPort(lsp: uuid, lsp_router_port: string, ls: uuid)
 LogicalSwitchRouterPort(lsp, lsp_router_port, ls) :-
   LogicalSwitchPort(lsp, ls),
-  nb::Logical_Switch_Port(._uuid = lsp, .__type = "router", .options = options),
+  &nb::Logical_Switch_Port(._uuid = lsp, .__type = "router", .options = options),
   Some{var lsp_router_port} = options.get("router-port").
 
 /*
diff --git a/northd/lswitch.dl b/northd/lswitch.dl
index 7a49ac17dbab..419117f743b3 100644
--- a/northd/lswitch.dl
+++ b/northd/lswitch.dl
@@ -22,7 +22,7 @@ import helpers
 import ipam
 import vec
 
-function is_enabled(lsp: nb::Logical_Switch_Port): bool { is_enabled(lsp.enabled) }
+function is_enabled(lsp: Intern<nb::Logical_Switch_Port>): bool { is_enabled(lsp.enabled) }
 function is_enabled(sp: SwitchPort): bool { sp.lsp.is_enabled() }
 function is_enabled(sp: Intern<SwitchPort>): bool { sp.lsp.is_enabled() }
 
@@ -33,7 +33,7 @@ SwitchRouterPeerRef(lsp, Some{rport}) :-
 rport in &RouterPort(.lrp = &nb::Logical_Router_Port{._uuid = lrp}).
 
 SwitchRouterPeerRef(lsp, None) :-
-nb::Logical_Switch_Port(._uuid = lsp),
+&nb::Logical_Switch_Port(._uuid = lsp),
 not SwitchRouterPeer(lsp, _, _).
 
 /* LogicalSwitchPortCandidate.
@@ -50,7 +50,7 @@ Warning[message] :-
 LogicalSwitchPortCandidate(lsp_uuid, ls_uuid),
 var lss = ls_uuid.group_by(lsp_uuid).to_set(),
 lss.size() > 1,
-lsp in nb::Logical_Switch_Port(._uuid = lsp_uuid),
+lsp in &nb::Logical_Switch_Port(._uuid = lsp_uuid),
 var message = "Bad configuration: logical switch port ${lsp.name} belongs "
 "to more than one logical switch".
 
@@ -66,7 +66,7 @@ LogicalSwitchPort(lsp_uuid, ls_uuid) :-
 relation LogicalSwitchPortWithUnknownAddress(ls: uuid, lsp: uuid)
 LogicalSwitchPortWithUnknownAddress(ls_uuid, lsp_uuid) :-
 LogicalSwitchPort(lsp_uuid, ls_uuid),
-lsp in nb::Logical_Switch_Port(._uuid = lsp_uuid),
+lsp in &nb::Logical_Switch_Port(._uuid = lsp_uuid),
 lsp.is_enabled() and lsp.addresses.contains("unknown").
 
 relation LogicalSwitchHasUnknownPorts(ls: uuid, has_unknown: bool)
@@ -81,7 +81,7 @@ relation PortStaticAddresses(lsport: uuid, ip4addrs: 
Set, ip6addrs: Set<
 PortStaticAddresses(.lsport = port_uuid,
 .ip4addrs   = ip4_addrs.union(),
 .ip6addrs   = ip6_addrs.union()) :-
-nb::Logical_Switch_Port(._uuid = port_uuid, .addresses = addresses),
+&nb::Logical_Switch_Port(._uuid = port_uuid, .addresses = addresses),
 var address = FlatMap(if (addresses.is_empty()) { set_singleton("") } else { addresses }),
 (var ip4addrs, var ip6addrs) = if (not is_dynamic_lsp_address(address)) {
 split_addresses(address)
@@ -133,7 +133,7 @@ relation LogicalSwitchLocalnetPort0(ls_uuid: uuid, lsp: (uuid, string))
 LogicalSwitchLocalnetPort0(ls_uuid, (lsp_uuid, lsp.name)) :-
 ls in nb::Logical_Switch(._uuid = ls_uuid),
 var lsp_uuid = FlatMap(ls.ports),
-lsp in nb::Logical_Switch_Port(._uuid = lsp_uuid),
+lsp in &nb::Logical_Switch_Port(._uuid = lsp_uuid),
 lsp.__type == "localnet".
 
 relation LogicalSwitchLocalnetPorts(ls_uuid: uuid, localnet_ports: Vec<(uuid, string)>)
@@ -173,7 +173,7 @@ relation LogicalSwitchHasNonRouterPort0(ls: uuid)
 LogicalSwitchHasNonRouterPort0(ls_uuid) :-
 ls in nb::Logical_Switch(._uuid = ls_uuid),
 var lsp_uuid = FlatMap(ls.ports),
-lsp in nb::Logical_Switch_Port(._uuid = lsp_uuid),
+lsp in &nb::Logical_Switch_Port(._uuid = lsp_uuid),
 lsp.__type != "router".
 
 relation LogicalSwitchHasNonRouterPort(ls: uuid, has_non_router_port: bool)
@@ -512

[ovs-dev] [PATCH v2 26/26] tutorial: Add benchmarking test script to run within sandbox.

2021-04-01 Thread Ben Pfaff
This is originally from Numan Siddique.  I have adapted it a bit
to run faster by using the ovn-nbctl and ovn-sbctl daemons and
combining multiple calls into just one.

I'm uncertain whether to actually commit this; I need a sign-off
from Numan to do so.

Signed-off-by: Ben Pfaff 
CC: Numan Siddique 
---
 tutorial/automake.mk  |  3 +-
 tutorial/northd_ddlog_test.sh | 81 +++
 2 files changed, 83 insertions(+), 1 deletion(-)
 create mode 100755 tutorial/northd_ddlog_test.sh

diff --git a/tutorial/automake.mk b/tutorial/automake.mk
index 13b3bee055c9..f2571c2cfd98 100644
--- a/tutorial/automake.mk
+++ b/tutorial/automake.mk
@@ -6,7 +6,8 @@ EXTRA_DIST += \
tutorial/t-stage2 \
tutorial/t-stage3 \
tutorial/t-stage4 \
-   tutorial/ovn-setup.sh
+   tutorial/ovn-setup.sh \
+   tutorial/northd_ddlog_test.sh
 sandbox: all
cd $(srcdir)/tutorial && MAKE=$(MAKE) HAVE_OPENSSL=$(HAVE_OPENSSL) \
./ovs-sandbox -b $(abs_builddir) --ovs-src $(ovs_srcdir) --ovs-build $(ovs_builddir) $(SANDBOXFLAGS)
diff --git a/tutorial/northd_ddlog_test.sh b/tutorial/northd_ddlog_test.sh
new file mode 100755
index ..57e45d96228f
--- /dev/null
+++ b/tutorial/northd_ddlog_test.sh
@@ -0,0 +1,81 @@
+#!/bin/bash
+
+ddlog_running () {
+test -e sandbox/ovn-north-ddlog.pid
+}
+
+rm -f sandbox/profile-[0-9]*.txt
+if ddlog_running; then 
+ovs-appctl -t ovn-northd-ddlog enable-cpu-profiling
+fi
+
+export OVN_NB_DAEMON=$(ovn-nbctl --pidfile --detach)
+export OVN_SB_DAEMON=$(ovn-sbctl --pidfile --detach)
+trap 'kill $(cat $OVN_RUNDIR/ovn-nbctl.pid) $(cat $OVN_RUNDIR/ovn-sbctl.pid)' 0
+
+ovn-nbctl set NB_Global . options:northd_probe_interval=18
+
+ovn-nbctl pg-add portGroupDefDeny
+ovn-nbctl pg-add portGroupMultiDefDeny
+ovn-nbctl lr-add cluster_router
+
+step () {
+lswitch_name=lswitch_17.${i}.0.0/16
+ext_switch=ext_ls_2.${i}.0.0/16
+ext_lrouter=ext_lr_2.${i}.0.0/16
+j=2
+port_name=lp_17.${i}.0.${j}
+port_ip=17.${i}.0.${j}
+np=networkPolicy-$i-$j
+ns=nameSpace-$i-$j
+mg=mcastPortGroup_$ns
+ovn-sbctl chassis-add ch$i geneve 128.0.0.$i
+ovn-nbctl --wait=sb \
+ls-add ${lswitch_name} -- \
+lrp-add cluster_router lr-$lswitch_name 00:00:00:00:ff:$i 17.${i}.0.254/16 -- \
+lsp-add $lswitch_name $lswitch_name-lr -- \
+lsp-set-type $lswitch_name-lr router -- \
+lsp-set-addresses $lswitch_name-lr router -- \
+lsp-set-options $lswitch_name-lr router-port=lr-$lswitch_name -- \
+ls-add $ext_switch -- \
+lr-add $ext_lrouter -- \
+lrp-add $ext_lrouter extlr-$lswitch_name 00:00:00:10:af:$i 2.${i}.0.254/16 -- \
+lsp-add $ext_switch $ext_switch-lr_2.$i -- \
+lsp-set-type $ext_switch-lr_2.$i router -- \
+lsp-set-addresses $ext_switch-lr_2.$i router -- \
+lsp-set-options $ext_switch-lr_2.$i router-port=extlr-$lswitch_name -- \
+lr-nat-add $ext_lrouter snat 2.${i}.0.100 17.${i}.0.0/16 -- \
+lr-route-add $ext_lrouter 17.${i}.1.0/16 20.0.0.2 -- \
+--policy="src-ip" lr-route-add $ext_lrouter 192.168.2.0/24 20.0.0.3 -- \
+--policy="src-ip" lr-route-add cluster_router 17.${i}.1.0/16 20.0.0.4 -- \
+set logical_router $ext_lrouter options:chassis=ch$i -- \
+lsp-add ${lswitch_name}  ${port_name} -- \
+lsp-set-addresses ${port_name} "dynamic ${port_ip}" -- \
+--id=@lsp get logical_switch_port ${port_name} -- \
+add port_group portGroupDefDeny  ports @lsp -- \
+add port_group portGroupMultiDefDeny ports @lsp -- \
+pg-add $np $port_name -- \
+create Address_Set name=${np}_ingress_as addresses=$port_ip -- \
+create Address_Set name=${np}_egress_as addresses=$port_ip -- \
+acl-add $np from-lport 1010 "inport == @$np && ip4.src == ${np}_ingress_as" allow -- \
+acl-add $np from-lport 1009 "inport == @$np && ip4" allow-related -- \
+acl-add $np to-lport 1010 "outport == @$np && ip4.dst == ${np}_egress_as" allow -- \
+acl-add $np to-lport 1009 "outport == @$np && ip4" allow -- \
+create Address_Set name=$ns addresses=$port_ip -- \
+pg-add $mg $port_name -- \
+acl-add $mg from-lport 1012 "inport == @${mg} && ip4.mcast" allow -- \
+acl-add $mg to-lport 1012 "outport == @${mg} && ip4.mcast" allow >/dev/null
+ovn-sbctl lsp-bind $port_name ch$i
+
+if ddlog_running; then
+ovs-appctl -t ovn-northd-ddlog profile > sandbox/profile-$i.txt
+fi
+}
+
+rm -f timings
+i=1
+while [ $i -lt 255 ]
+do
+printf "step $i: "; TIMEFORMAT=%R; time step
+i=$((i+1))
+done
-- 
2.29.2



[ovs-dev] [PATCH v2 18/26] ovn-northd-ddlog: Intern the RouterPort table.

2021-04-01 Thread Ben Pfaff
From: Leonid Ryzhyk 

Change the type of record in the `RouterPort` table from
`Ref<RouterPort>` to `Intern<RouterPort>`.

Signed-off-by: Leonid Ryzhyk 
Signed-off-by: Ben Pfaff 
---
 northd/lrouter.dl| 37 +
 northd/lswitch.dl|  4 ++--
 northd/ovn_northd.dl | 14 +++---
 3 files changed, 30 insertions(+), 25 deletions(-)

diff --git a/northd/lrouter.dl b/northd/lrouter.dl
index b2b429af3c96..e4e5cbf9f212 100644
--- a/northd/lrouter.dl
+++ b/northd/lrouter.dl
@@ -23,7 +23,7 @@ import lswitch
 function is_enabled(lr: nb::Logical_Router): bool { is_enabled(lr.enabled) }
 function is_enabled(lrp: nb::Logical_Router_Port): bool { is_enabled(lrp.enabled) }
 function is_enabled(rp: RouterPort): bool { rp.lrp.is_enabled() }
-function is_enabled(rp: Ref<RouterPort>): bool { rp.lrp.is_enabled() }
+function is_enabled(rp: Intern<RouterPort>): bool { rp.lrp.is_enabled() }
 
 /* default logical flow prioriry for distributed routes */
 function dROUTE_PRIO(): bit<32> = 400
@@ -575,7 +575,7 @@ RouterPortHasBfd(lrp_uuid, false) :-
 
 /* FIXME: what should happen when extract_lrp_networks fails? */
 /* RouterPort relation collects all attributes of a logical router port */
-relation &RouterPort(
+typedef RouterPort = RouterPort {
 lrp:  nb::Logical_Router_Port,
 json_name:string,
 networks: lport_addresses,
@@ -584,17 +584,22 @@ relation &RouterPort(
 peer: RouterPeer,
 mcast_cfg:Ref,
 sb_options:   Map,
-has_bfd:  bool)
-
-&RouterPort(.lrp= lrp,
-.json_name  = json_string_escape(lrp.name),
-.networks   = networks,
-.router = router,
-.is_redirect= is_redirect,
-.peer   = peer,
-.mcast_cfg  = mcast_cfg,
-.sb_options = sb_options,
-.has_bfd= has_bfd) :-
+has_bfd:  bool
+}
+
+relation RouterPort[Intern<RouterPort>]
+
+RouterPort[RouterPort{
+   .lrp= lrp,
+   .json_name  = json_string_escape(lrp.name),
+   .networks   = networks,
+   .router = router,
+   .is_redirect= is_redirect,
+   .peer   = peer,
+   .mcast_cfg  = mcast_cfg,
+   .sb_options = sb_options,
+   .has_bfd= has_bfd
+   }.intern()] :-
 nb::Logical_Router_Port[lrp],
 Some{var networks} = extract_lrp_networks(lrp.mac, lrp.networks),
 LogicalRouterPort(lrp._uuid, lrouter_uuid),
@@ -605,13 +610,13 @@ relation &RouterPort(
 RouterPortSbOptions(lrp._uuid, sb_options),
 RouterPortHasBfd(lrp._uuid, has_bfd).
 
-relation RouterPortNetworksIPv4Addr(port: Ref<RouterPort>, addr: ipv4_netaddr)
+relation RouterPortNetworksIPv4Addr(port: Intern<RouterPort>, addr: ipv4_netaddr)
 
 RouterPortNetworksIPv4Addr(port, addr) :-
 port in &RouterPort(.networks = networks),
 var addr = FlatMap(networks.ipv4_addrs).
 
-relation RouterPortNetworksIPv6Addr(port: Ref<RouterPort>, addr: ipv6_netaddr)
+relation RouterPortNetworksIPv6Addr(port: Intern<RouterPort>, addr: ipv6_netaddr)
 
 RouterPortNetworksIPv6Addr(port, addr) :-
 port in &RouterPort(.networks = networks),
@@ -733,7 +738,7 @@ RouterStaticRoute_(.router = router,
 typedef route_dst = RouteDst {
 nexthop: v46_ip,
 src_ip: v46_ip,
-port: Ref<RouterPort>,
+port: Intern<RouterPort>,
 ecmp_symmetric_reply: bool
 }
 
diff --git a/northd/lswitch.dl b/northd/lswitch.dl
index c089fadac863..aff265bbf82c 100644
--- a/northd/lswitch.dl
+++ b/northd/lswitch.dl
@@ -27,7 +27,7 @@ function is_enabled(lsp: Ref): bool 
{ lsp.deref().is_en
 function is_enabled(sp: SwitchPort): bool { sp.lsp.is_enabled() }
 function is_enabled(sp: Intern): bool { sp.lsp.is_enabled() }
 
-relation SwitchRouterPeerRef(lsp: uuid, rport: Option<Ref<RouterPort>>)
+relation SwitchRouterPeerRef(lsp: uuid, rport: Option<Intern<RouterPort>>)
 
 SwitchRouterPeerRef(lsp, Some{rport}) :-
 SwitchRouterPeer(lsp, _, lrp),
@@ -555,7 +555,7 @@ typedef SwitchPort = SwitchPort {
 lsp:nb::Logical_Switch_Port,
 json_name:  string,
 sw: Intern,
-peer:   Option<Ref<RouterPort>>,
+peer:   Option<Intern<RouterPort>>,
 static_addresses:   Vec,
 dynamic_address:Option,
 static_dynamic_mac: Option,
diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl
index c4d46b2a8406..9e53821719c9 100644
--- a/northd/ovn_northd.dl
+++ b/northd/ovn_northd.dl
@@ -192,7 +192,7 @@ OutProxy_Port_Binding(._uuid  = lsp._uuid,
 Some{"router"} -> match ((l3dgw_port, opt_chassis, peer)) {
  (None, None, _) -> set_empty(),
  (_, _, None) -> set_empty(),
- (_, _, Some{rport}) -> 
get_nat_addresses(deref(rport))
+   

[ovs-dev] [PATCH v2 23/26] ovn-northd-ddlog: Intern nb::Logical_Router_Port.

2021-04-01 Thread Ben Pfaff
From: Leonid Ryzhyk 

Use the `--intern-table` switch to intern `Logical_Router_Port` records,
so that they can be copied and compared efficiently by pointer.

Signed-off-by: Leonid Ryzhyk 
Signed-off-by: Ben Pfaff 
---
 northd/helpers.dl|  2 +-
 northd/lrouter.dl| 50 ++--
 northd/lswitch.dl|  2 +-
 northd/multicast.dl  |  2 +-
 northd/ovn-nb.dlopts |  1 +
 northd/ovn_northd.dl | 34 +++---
 6 files changed, 46 insertions(+), 45 deletions(-)

diff --git a/northd/helpers.dl b/northd/helpers.dl
index 33a8d15d8b32..820e37bb3072 100644
--- a/northd/helpers.dl
+++ b/northd/helpers.dl
@@ -25,7 +25,7 @@ relation SwitchRouterPeer(lsp: uuid, lsp_name: string, lrp: uuid)
 SwitchRouterPeer(lsp, lsp_name, lrp) :-
 nb::Logical_Switch_Port(._uuid = lsp, .name = lsp_name, .__type = "router", .options = options),
 Some{var router_port} = options.get("router-port"),
-nb::Logical_Router_Port(.name = router_port, ._uuid = lrp).
+&nb::Logical_Router_Port(.name = router_port, ._uuid = lrp).
 
 function get_bool_def(m: Map, k: string, def: bool): bool = {
 m.get(k)
diff --git a/northd/lrouter.dl b/northd/lrouter.dl
index c51f0fbe6c44..81e4a03e8a91 100644
--- a/northd/lrouter.dl
+++ b/northd/lrouter.dl
@@ -21,7 +21,7 @@ import helpers
 import lswitch
 
 function is_enabled(lr: nb::Logical_Router): bool { is_enabled(lr.enabled) }
-function is_enabled(lrp: nb::Logical_Router_Port): bool { is_enabled(lrp.enabled) }
+function is_enabled(lrp: Intern<nb::Logical_Router_Port>): bool { is_enabled(lrp.enabled) }
 function is_enabled(rp: RouterPort): bool { rp.lrp.is_enabled() }
 function is_enabled(rp: Intern<RouterPort>): bool { rp.lrp.is_enabled() }
 
@@ -42,7 +42,7 @@ Warning[message] :-
 LogicalRouterPortCandidate(lrp_uuid, lr_uuid),
 var lrs = lr_uuid.group_by(lrp_uuid).to_set(),
 lrs.size() > 1,
-lrp in nb::Logical_Router_Port(._uuid = lrp_uuid),
+lrp in &nb::Logical_Router_Port(._uuid = lrp_uuid),
 var message = "Bad configuration: logical router port ${lrp.name} belongs "
 "to more than one logical router".
 
@@ -69,9 +69,9 @@ LogicalRouterPort(lrp_uuid, lr_uuid) :-
 relation PeerLogicalRouter(a: uuid, b: uuid)
 PeerLogicalRouter(lrp_uuid, peer._uuid) :-
   LogicalRouterPort(lrp_uuid, _),
-  lrp in nb::Logical_Router_Port(._uuid = lrp_uuid),
+  lrp in &nb::Logical_Router_Port(._uuid = lrp_uuid),
   Some{var peer_name} = lrp.peer,
-  peer in nb::Logical_Router_Port(.name = peer_name),
+  peer in &nb::Logical_Router_Port(.name = peer_name),
   peer.peer == Some{lrp.name}, // 'peer' must point back to 'lrp'
   lrp_uuid != peer._uuid.  // No reflexive pointers.
 
@@ -86,7 +86,7 @@ PeerLogicalRouter(lrp_uuid, peer._uuid) :-
 relation FirstHopLogicalRouter(lrouter: uuid, lswitch: uuid)
 FirstHopLogicalRouter(lrouter, lswitch) :-
   LogicalRouterPort(lrp_uuid, lrouter),
-  lrp in nb::Logical_Router_Port(._uuid = lrp_uuid, .peer = None),
+  lrp in &nb::Logical_Router_Port(._uuid = lrp_uuid, .peer = None),
   LogicalSwitchRouterPort(lsp_uuid, lrp.name, lswitch).
 
 relation LogicalSwitchRouterPort(lsp: uuid, lsp_router_port: string, ls: uuid)
@@ -119,7 +119,7 @@ ReachableLogicalRouter(a, a) :- ReachableLogicalRouter(a, _).
 
 // ha_chassis_group and gateway_chassis may not both be present.
 Warning[message] :-
-lrp in nb::Logical_Router_Port(),
+lrp in &nb::Logical_Router_Port(),
 lrp.ha_chassis_group.is_some(),
 not lrp.gateway_chassis.is_empty(),
 var message = "Both ha_chassis_group and gateway_chassis configured on "
@@ -127,7 +127,7 @@ Warning[message] :-
 
 // A distributed gateway port cannot also be an L3 gateway router.
 Warning[message] :-
-lrp in nb::Logical_Router_Port(),
+lrp in &nb::Logical_Router_Port(),
 lrp.ha_chassis_group.is_some() or not lrp.gateway_chassis.is_empty(),
 lrp.options.contains_key("chassis"),
 var message = "Bad configuration: distributed gateway port configured on "
@@ -143,7 +143,7 @@ relation DistributedGatewayPortCandidate(lr_uuid: uuid, lrp_uuid: uuid)
 DistributedGatewayPortCandidate(lr_uuid, lrp_uuid) :-
 lr in nb::Logical_Router(._uuid = lr_uuid),
 LogicalRouterPort(lrp_uuid, lr._uuid),
-lrp in nb::Logical_Router_Port(._uuid = lrp_uuid),
+lrp in &nb::Logical_Router_Port(._uuid = lrp_uuid),
 not lrp.options.contains_key("chassis"),
 var has_hcg = lrp.ha_chassis_group.is_some(),
 var has_gc = not lrp.gateway_chassis.is_empty(),
@@ -161,13 +161,13 @@ Warning[message] :-
  * Each row means 'lrp' is the distributed gateway port on 'lr_uuid'.
  *
  * There is at most one distributed gateway port per logical router. */
-relation DistributedGatewayPort(lrp: nb::Logical_Router_Port, lr_uuid: uuid)
+relation DistributedGatewayPort(lrp: Intern<nb::Logical_Router_Port>, lr_uuid: uuid)
 DistributedGatewayPort(lrp, lr_uuid) :-
 DistributedGatewayPortCandidate(lr_uuid, lrp_uuid),
 var lrps = lrp_uuid.group_by(lr_uuid).to_set(),
 lrps.size() == 1,

[ovs-dev] [PATCH v2 17/26] ovn-northd-ddlog: Intern the SwitchPort table.

2021-04-01 Thread Ben Pfaff
From: Leonid Ryzhyk 

Change the type of record in the `SwitchPort` table from
`Ref<SwitchPort>` to `Intern<SwitchPort>`.

Signed-off-by: Leonid Ryzhyk 
Signed-off-by: Ben Pfaff 
---
 northd/ipam.dl   | 22 +++
 northd/lswitch.dl| 64 +++-
 northd/ovn_northd.dl |  2 +-
 3 files changed, 46 insertions(+), 42 deletions(-)

diff --git a/northd/ipam.dl b/northd/ipam.dl
index e7373f250a7f..40d542ec0687 100644
--- a/northd/ipam.dl
+++ b/northd/ipam.dl
@@ -212,7 +212,7 @@ SwitchPortAllocatedIPv4DynAddress(lsport, dyn_addr) :-
 Some{port0} -> {
 match (port0.sw.subnet) {
 None -> {
-abort("needs_dynamic_ipv4address is true, but subnet is undefined in port ${uuid2str(deref(port0).lsp._uuid)}");
+abort("needs_dynamic_ipv4address is true, but subnet is undefined in port ${uuid2str(port0.lsp._uuid)}");
 (0, 0)
 },
 Some{(_, _, start_ipv4, total_ipv4s)} -> (start_ipv4, total_ipv4s)
@@ -220,27 +220,27 @@ SwitchPortAllocatedIPv4DynAddress(lsport, dyn_addr) :-
 }
 };
 for (port in ports) {
-//warn("port(${deref(port).lsp._uuid})");
-match (deref(port).dynamic_address) {
+//warn("port(${port.lsp._uuid})");
+match (port.dynamic_address) {
 None -> {
 /* no dynamic address yet -- allocate one now */
-//warn("need_addr(${deref(port).lsp._uuid})");
-need_addr.push(deref(port).lsp._uuid)
+//warn("need_addr(${port.lsp._uuid})");
+need_addr.push(port.lsp._uuid)
 },
 Some{dynaddr} -> {
  match (dynaddr.ipv4_addrs.nth(0)) {
 None -> {
 /* dynamic address does not have IPv4 component -- allocate one now */
-//warn("need_addr(${deref(port).lsp._uuid})");
-need_addr.push(deref(port).lsp._uuid)
+//warn("need_addr(${port.lsp._uuid})");
+need_addr.push(port.lsp._uuid)
 },
 Some{addr} -> {
 var haddr = addr.addr.a;
 if (haddr < start_ipv4 or haddr >= start_ipv4 + total_ipv4s) {
-need_addr.push(deref(port).lsp._uuid)
+need_addr.push(port.lsp._uuid)
 } else if (used_addrs.contains(haddr)) {
-need_addr.push(deref(port).lsp._uuid);
-warn("Duplicate IP set on switch ${deref(port).lsp.name}: ${addr.addr}")
+need_addr.push(port.lsp._uuid);
+warn("Duplicate IP set on switch ${port.lsp.name}: ${addr.addr}")
 } else {
 /* has valid dynamic address -- record it in used_addrs */
 used_addrs.insert(haddr);
@@ -459,7 +459,7 @@ SwitchPortNewMACDynAddress(lsp._uuid, addr) :-
  * Dynamic IPv6 address allocation.
  * `needs_dynamic_ipv6address` -> mac.to_ipv6_eui64(ipv6_prefix)
  */
-relation SwitchPortNewDynamicAddress(port: Ref, address: 
Option)
+relation SwitchPortNewDynamicAddress(port: Intern, address: 
Option)
 
 SwitchPortNewDynamicAddress(port, None) :-
 port in &SwitchPort(.lsp = lsp),
diff --git a/northd/lswitch.dl b/northd/lswitch.dl
index 25abd0aa8189..c089fadac863 100644
--- a/northd/lswitch.dl
+++ b/northd/lswitch.dl
@@ -25,7 +25,7 @@ import vec
 function is_enabled(lsp: nb::Logical_Switch_Port): bool { is_enabled(lsp.enabled) }
 function is_enabled(lsp: Ref<nb::Logical_Switch_Port>): bool { lsp.deref().is_enabled() }
 function is_enabled(sp: SwitchPort): bool { sp.lsp.is_enabled() }
-function is_enabled(sp: Ref<SwitchPort>): bool { sp.lsp.is_enabled() }
+function is_enabled(sp: Intern<SwitchPort>): bool { sp.lsp.is_enabled() }
 
 relation SwitchRouterPeerRef(lsp: uuid, rport: Option>)
 
@@ -445,7 +445,7 @@ LBVIPBackendStatus(lbvip, backend, true) :-
 
 /* SwitchPortDHCPv4Options: many-to-one relation between logical switches and 
DHCPv4 options */
 relation SwitchPortDHCPv4Options(
-port: Ref,
+port: Intern,
 dhcpv4_options: Ref)
 
 SwitchPortDHCPv4Options(port, options) :-
@@ -456,7 +456,7 @@ SwitchPortDHCPv4Options(port, options) :-
 
 /* SwitchPortDHCPv6Options: many-to-one relation between logical switches and 
DHCPv4 options */
 relation SwitchPortDHCPv6Options(
-port: Ref,
+port: Intern,
 dhcpv6_options: Ref)
 
 SwitchPortDHCPv6Options(port, options) :-
@@ -551,7 +551,7 @@ SwitchPortHAChassisGroup(lsp_uuid, None) :-
  * - `up`- true if the port is bound to a chassis or 
has type ""
  * - 'hac_group_uuid'- uuid o

[ovs-dev] [PATCH v2 21/26] ovn-northd-ddlog: Eliminate redundant dereferences.

2021-04-01 Thread Ben Pfaff
From: Leonid Ryzhyk 

We eliminate an anti-pattern in the use of smart pointers that occurred
throughout the DDlog code.

Consider relation `A` that contains field `x` wrapped in a DDlog smart
pointer (`Intern<>` or `Ref<>`):

```
relation A(x: Intern<T>, ...)
```

Here `T` might be a complex type with dynamically allocated fields like
vectors and maps, etc.  Here is how _not_ to use this relation in a
rule:

```
Rel(...) :- A(.x = &v),
B(v.field1),
C(v.field2).
```

The `&v` syntax here extracts the inner value that the smart pointer
points to and binds it to variable `v`.  Thus, the type of `v` is `T`
and we thread the value of `T` through the entire rule, which requires
creating two more copies of it (types not wrapped in smart pointers are
copied by value).  This is a waste of memory and CPU and is completely
unnecessary, as we can instead bind `v` to the value of the smart
pointer, so it can be copied efficiently:

```
Rel(...) :- A(.x = v),  // type of `v` is `Intern<T>`.
B(v.field1),
C(v.field2).
```

The inefficient usage is a leftover from the days when DDlog had some
awkward restrictions on the use of smart pointers.
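
The cost difference between the two bindings can be sketched in Rust,
the language DDlog compiles to.  This is only an analogy, not DDlog's
generated code: `Rc` stands in for a DDlog smart pointer, and `T`,
`thread_by_value` and `thread_by_pointer` are hypothetical names.

```rust
use std::rc::Rc;

// Hypothetical stand-in for a complex type `T` with heap-allocated fields.
#[derive(Clone, Debug, PartialEq)]
struct T {
    field1: Vec<u32>,
    field2: String,
}

// Analogue of `A(.x = &v)`: bind the inner value, so every later use of
// `v` needs its own deep copy of the vector and the string.
fn thread_by_value(x: &Rc<T>) -> (T, T) {
    let v: T = (**x).clone(); // deep copy out of the pointer
    (v.clone(), v)            // one more deep copy for the second use
}

// Analogue of `A(.x = v)`: bind the smart pointer, so every later use of
// `v` only increments a reference count.
fn thread_by_pointer(x: &Rc<T>) -> (Rc<T>, Rc<T>) {
    let v = Rc::clone(x);
    (Rc::clone(&v), v)
}

fn main() {
    let x = Rc::new(T { field1: vec![1, 2, 3], field2: String::from("a") });

    // By value: two independent copies exist; `x` is still the only owner
    // of the original allocation.
    let (a, b) = thread_by_value(&x);
    assert_eq!(a, b);
    assert_eq!(Rc::strong_count(&x), 1);

    // By pointer: both results share the original allocation.
    let (p, q) = thread_by_pointer(&x);
    assert!(Rc::ptr_eq(&p, &q));
    assert_eq!(Rc::strong_count(&x), 3);
}
```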

Note that `&` is still useful and does not incur any overhead when used
to deconstruct an object wrapped in a smart pointer and refer to its
fields, e.g.:

```
Rel(...) :- // Bind `f1` and `f2` to `x.field1` and `x.field2`;
// filter out records where `field3` is `false`.
A(.x = &T{.field1 = f1, .field2 = f2, .field3 = true}),
B(f1),
C(f2).
```

On top of this, the `@` operator can be used to simultaneously bind the
entire value stored in `x` and its individual fields.

```
Rel(...) :- // Bind `v` to the value of field `x` (`v: Intern<T>`);
// bind `f2` to `x.field2`; filter on the value of `field3`.
A(.x = v @ &T{.field2 = f2, .field3 = true}),
B(v.field1),
C(f2).
```

The `&` in this rule is used to deconstruct the value inside the smart
pointer; however, since the binding operator `@` precedes `&`, we bind
`v` to the value of the smart pointer and not the type that it wraps.

Signed-off-by: Leonid Ryzhyk 
Signed-off-by: Ben Pfaff 
---
 northd/ipam.dl   |  14 +++---
 northd/multicast.dl  |   4 +-
 northd/ovn_northd.dl | 104 +--
 3 files changed, 61 insertions(+), 61 deletions(-)

diff --git a/northd/ipam.dl b/northd/ipam.dl
index 40d542ec0687..da71b2872952 100644
--- a/northd/ipam.dl
+++ b/northd/ipam.dl
@@ -161,7 +161,7 @@ SwitchIPv4ReservedAddress(.lswitch = ls_uuid,
 &SwitchPort(
 .sw = &Switch{._uuid = ls_uuid,
   .subnet = Some{(_, _, start_ipv4, total_ipv4s)}},
-.peer = Some{&rport}),
+.peer = Some{rport}),
 var addrs = {
 var addrs = set_empty();
 for (addr in rport.networks.ipv4_addrs) {
@@ -177,7 +177,7 @@ SwitchIPv4ReservedAddress(.lswitch = ls_uuid,
 /* Add reserved address group (5) */
 SwitchIPv4ReservedAddress(.lswitch = sw._uuid,
   .addr= ip_addr.a) :-
-&SwitchPort(.sw = &sw, .lsp = lsp, .static_dynamic_ipv4 = Some{ip_addr}).
+&SwitchPort(.sw = sw, .lsp = lsp, .static_dynamic_ipv4 = Some{ip_addr}).
 
 /* Aggregate all reserved addresses for each switch. */
 relation SwitchIPv4ReservedAddresses(lswitch: uuid, addrs: Set>)
@@ -197,7 +197,7 @@ relation SwitchPortAllocatedIPv4DynAddress(lsport: uuid, 
dyn_addr: Option)
 
 SwitchPortNewIPv4DynAddress(lsp._uuid, ip_addr) :-
-&SwitchPort(.sw = &sw,
+&SwitchPort(.sw = sw,
 .needs_dynamic_ipv4address = false,
 .static_dynamic_ipv4 = static_dynamic_ipv4,
 .lsp = lsp),
@@ -333,7 +333,7 @@ ReservedMACAddress(.addr = mac_addr.ha) :-
 
 /* Add reserved address group (3). */
 ReservedMACAddress(.addr = rport.networks.ea.ha) :-
-&SwitchPort(.peer = Some{&rport}).
+&SwitchPort(.peer = Some{rport}).
 
 /* Aggregate all reserved MAC addresses. */
 relation ReservedMACAddresses(addrs: Set>)
@@ -430,7 +430,7 @@ relation SwitchPortNewMACDynAddress(lsport: uuid, dyn_addr: 
Option)
 SwitchPortNewMACDynAddress(lsp._uuid, mac_addr) :-
 &SwitchPort(.needs_dynamic_macaddress = false,
 .lsp = lsp,
-.sw = &sw,
+.sw = sw,
 .static_dynamic_mac = static_dynamic_mac),
 var mac_addr = match (static_dynamic_mac) {
 None -> None,
@@ -467,7 +467,7 @@ SwitchPortNewDynamicAddress(port, None) :-
 
 SwitchPortNewDynamicAddress(port, lport_address) :-
 port in &SwitchPort(.lsp = lsp,
-.sw = &sw,
+.sw = sw,
 .needs_dynamic_ipv6address = needs_dynamic_ipv6address,
 .static_dynamic_ipv6 = static_dynamic_ipv6),
 SwitchPortNewMACDynAddress(lsp._uuid, Some{mac_addr}),
diff --git a/northd/multicast.dl b/northd/multicast.dl
index

[ovs-dev] [PATCH v2 15/26] ovn-northd-ddlog: Intern the Switch table.

2021-04-01 Thread Ben Pfaff
From: Leonid Ryzhyk 

Change the type of record in the `Switch` table from `Ref<Switch>` to
`Intern<Switch>`.

Signed-off-by: Leonid Ryzhyk 
Signed-off-by: Ben Pfaff 
---
 northd/lswitch.dl   | 37 +
 northd/multicast.dl | 10 +-
 2 files changed, 26 insertions(+), 21 deletions(-)

diff --git a/northd/lswitch.dl b/northd/lswitch.dl
index 47c497e0cff7..218272206e05 100644
--- a/northd/lswitch.dl
+++ b/northd/lswitch.dl
@@ -186,7 +186,7 @@ LogicalSwitchHasNonRouterPort(ls, false) :-
 
 /* Switch relation collects all attributes of a logical switch */
 
-relation &Switch(
+typedef Switch = Switch {
 ls:nb::Logical_Switch,
 has_stateful_acl:  bool,
 has_lb_vip:bool,
@@ -200,7 +200,10 @@ relation &Switch(
 
 /* Does this switch have at least one port with type != "router"? */
 has_non_router_port: bool
-)
+}
+
+
+relation Switch[Intern]
 
 function ipv6_parse_prefix(s: string): Option {
 if (string_contains(s, "/")) {
@@ -213,17 +216,19 @@ function ipv6_parse_prefix(s: string): Option {
 }
 }
 
-&Switch(.ls= ls,
-.has_stateful_acl  = has_stateful_acl,
-.has_lb_vip= has_lb_vip,
-.has_dns_records   = has_dns_records,
-.has_unknown_ports = has_unknown_ports,
-.localnet_ports= localnet_ports,
-.subnet= subnet,
-.ipv6_prefix   = ipv6_prefix,
-.mcast_cfg = mcast_cfg,
-.has_non_router_port = has_non_router_port,
-.is_vlan_transparent = is_vlan_transparent) :-
+Switch[Switch{
+   .ls= ls,
+   .has_stateful_acl  = has_stateful_acl,
+   .has_lb_vip= has_lb_vip,
+   .has_dns_records   = has_dns_records,
+   .has_unknown_ports = has_unknown_ports,
+   .localnet_ports= localnet_ports,
+   .subnet= subnet,
+   .ipv6_prefix   = ipv6_prefix,
+   .mcast_cfg = mcast_cfg,
+   .has_non_router_port = has_non_router_port,
+   .is_vlan_transparent = is_vlan_transparent
+   }.intern()] :-
 nb::Logical_Switch[ls],
 LogicalSwitchHasStatefulACL(ls._uuid, has_stateful_acl),
 LogicalSwitchHasLBVIP(ls._uuid, has_lb_vip),
@@ -449,7 +454,7 @@ SwitchPortDHCPv6Options(port, options) :-
 options in &DHCP_OptionsRef[nb::DHCP_Options{._uuid = dhcpv6_uuid}].
 
 /* SwitchQoS: many-to-one relation between logical switches and nb::QoS */
-relation SwitchQoS(sw: Ref, qos: Ref)
+relation SwitchQoS(sw: Intern, qos: Ref)
 
 SwitchQoS(sw, qos) :-
 sw in &Switch(.ls = nb::Logical_Switch{.qos_rules = qos_rules}),
@@ -475,7 +480,7 @@ ACLWithFairMeter(acl, meter) :-
 meter in &MeterRef[nb::Meter{.name = meter_name, .fair = Some{true}}].
 
 /* SwitchACL: many-to-many relation between logical switches and ACLs */
-relation &SwitchACL(sw: Ref,
+relation &SwitchACL(sw: Intern,
 acl: Ref,
 has_fair_meter: bool)
 
@@ -536,7 +541,7 @@ SwitchPortHAChassisGroup(lsp_uuid, None) :-
 relation &SwitchPort(
 lsp:nb::Logical_Switch_Port,
 json_name:  string,
-sw: Ref,
+sw: Intern,
 peer:   Option>,
 static_addresses:   Vec,
 dynamic_address:Option,
diff --git a/northd/multicast.dl b/northd/multicast.dl
index 9b0fa80738d7..5a14a90da1cd 100644
--- a/northd/multicast.dl
+++ b/northd/multicast.dl
@@ -100,7 +100,7 @@ relation &McastPortCfg(
 /* Mapping between Switch and the set of router port uuids on which to flood
  * IP multicast for relay.
  */
-relation SwitchMcastFloodRelayPorts(sw: Ref, ports: Set)
+relation SwitchMcastFloodRelayPorts(sw: Intern, ports: Set)
 
 SwitchMcastFloodRelayPorts(switch, relay_ports) :-
 &SwitchPort(
@@ -124,7 +124,7 @@ SwitchMcastFloodRelayPorts(switch, set_empty()) :-
 /* Mapping between Switch and the set of port uuids on which to
  * flood IP multicast statically.
  */
-relation SwitchMcastFloodPorts(sw: Ref, ports: Set)
+relation SwitchMcastFloodPorts(sw: Intern, ports: Set)
 
 SwitchMcastFloodPorts(switch, flood_ports) :-
 &SwitchPort(
@@ -142,7 +142,7 @@ SwitchMcastFloodPorts(switch, set_empty()) :-
 /* Mapping between Switch and the set of port uuids on which to
  * flood IP multicast reports statically.
  */
-relation SwitchMcastFloodReportPorts(sw: Ref, ports: Set)
+relation SwitchMcastFloodReportPorts(sw: Intern, ports: Set)
 
 SwitchMcastFloodReportPorts(switch, flood_ports) :-
 &SwitchPort(
@@ -179,7 +179,7 @@ RouterMcastFloodPorts(router, set_empty()) :-
 /* Flattened IGMP group. One record per address-port tuple. */
 relation IgmpSwitchGroupPort(
 address: string,
-switch : Ref,
+switch : Intern,
 port   : uuid
 )
 
@@ -199,7 +199,7 @@ IgmpSwitchGroupPort(address, switch, localnet_port.0) :-
  */
 relation IgmpSwitchMulticastGroup(
 addre

[ovs-dev] [PATCH v2 20/26] ovn-northd-ddlog: Eliminate remaining Ref's.

2021-04-01 Thread Ben Pfaff
From: Leonid Ryzhyk 

Change all remaining occurrences of `Ref` to `Intern` throughout
the DDlog code base.

Signed-off-by: Leonid Ryzhyk 
Signed-off-by: Ben Pfaff 
---
 northd/helpers.dl| 36 +-
 northd/lrouter.dl| 22 
 northd/lswitch.dl| 61 
 northd/multicast.dl  | 49 ---
 northd/ovn_northd.dl |  8 +++---
 5 files changed, 95 insertions(+), 81 deletions(-)

diff --git a/northd/helpers.dl b/northd/helpers.dl
index 32a5526d59d5..49281fcafc9a 100644
--- a/northd/helpers.dl
+++ b/northd/helpers.dl
@@ -21,40 +21,40 @@ import ovn
 output relation Warning[string]
 
 /* ACLRef: reference to nb::ACL */
-relation &ACLRef[nb::ACL]
-&ACLRef[acl] :- nb::ACL[acl].
+relation ACLRef[Intern]
+ACLRef[acl.intern()] :- nb::ACL[acl].
 
 /* DHCP_Options: reference to nb::DHCP_Options */
-relation &DHCP_OptionsRef[nb::DHCP_Options]
-&DHCP_OptionsRef[options] :- nb::DHCP_Options[options].
+relation DHCP_OptionsRef[Intern]
+DHCP_OptionsRef[options.intern()] :- nb::DHCP_Options[options].
 
 /* QoS: reference to nb::QoS */
-relation &QoSRef[nb::QoS]
-&QoSRef[qos] :- nb::QoS[qos].
+relation QoSRef[Intern]
+QoSRef[qos.intern()] :- nb::QoS[qos].
 
 /* LoadBalancerRef: reference to nb::Load_Balancer */
-relation &LoadBalancerRef[nb::Load_Balancer]
-&LoadBalancerRef[lb] :- nb::Load_Balancer[lb].
+relation LoadBalancerRef[Intern]
+LoadBalancerRef[lb.intern()] :- nb::Load_Balancer[lb].
 
 /* LoadBalancerHealthCheckRef: reference to nb::Load_Balancer_Health_Check */
-relation &LoadBalancerHealthCheckRef[nb::Load_Balancer_Health_Check]
-&LoadBalancerHealthCheckRef[lbhc] :- nb::Load_Balancer_Health_Check[lbhc].
+relation LoadBalancerHealthCheckRef[Intern]
+LoadBalancerHealthCheckRef[lbhc.intern()] :- 
nb::Load_Balancer_Health_Check[lbhc].
 
 /* MeterRef: reference to nb::Meter*/
-relation &MeterRef[nb::Meter]
-&MeterRef[meter] :- nb::Meter[meter].
+relation MeterRef[Intern]
+MeterRef[meter.intern()] :- nb::Meter[meter].
 
 /* NATRef: reference to nb::NAT*/
-relation &NATRef[nb::NAT]
-&NATRef[nat] :- nb::NAT[nat].
+relation NATRef[Intern]
+NATRef[nat.intern()] :- nb::NAT[nat].
 
 /* AddressSetRef: reference to nb::Address_Set */
-relation &AddressSetRef[nb::Address_Set]
-&AddressSetRef[__as] :- nb::Address_Set[__as].
+relation AddressSetRef[Intern]
+AddressSetRef[__as.intern()] :- nb::Address_Set[__as].
 
 /* ServiceMonitor: reference to sb::Service_Monitor */
-relation &ServiceMonitorRef[sb::Service_Monitor]
-&ServiceMonitorRef[sm] :- sb::Service_Monitor[sm].
+relation ServiceMonitorRef[Intern]
+ServiceMonitorRef[sm.intern()] :- sb::Service_Monitor[sm].
 
 /* Switch-to-router logical port connections */
 relation SwitchRouterPeer(lsp: uuid, lsp_name: string, lrp: uuid)
diff --git a/northd/lrouter.dl b/northd/lrouter.dl
index e4e5cbf9f212..23d320be6cc7 100644
--- a/northd/lrouter.dl
+++ b/northd/lrouter.dl
@@ -263,11 +263,11 @@ LogicalRouterRedirectPort(lr, None) :-
 nb::Logical_Router(._uuid = lr),
 not DistributedGatewayPort(_, lr).
 
-typedef ExceptionalExtIps = AllowedExtIps{ips: Ref}
-  | ExemptedExtIps{ips: Ref}
+typedef ExceptionalExtIps = AllowedExtIps{ips: Intern}
+  | ExemptedExtIps{ips: Intern}
 
 typedef NAT = NAT{
-nat: Ref,
+nat: Intern,
 external_ip: v46_ip,
 external_mac: Option,
 exceptional_ext_ips: Option
@@ -275,7 +275,7 @@ typedef NAT = NAT{
 
 relation LogicalRouterNAT0(
 lr: uuid,
-nat: Ref,
+nat: Intern,
 external_ip: v46_ip,
 external_mac: Option)
 LogicalRouterNAT0(lr, nat, external_ip, external_mac) :-
@@ -399,14 +399,14 @@ LogicalRouterSnatIPs(lr._uuid, map_empty()) :-
 lr in nb::Logical_Router(),
 not LogicalRouterSnatIP(.lr = lr._uuid).
 
-relation LogicalRouterLB(lr: uuid, nat: Ref)
+relation LogicalRouterLB(lr: uuid, nat: Intern)
 
 LogicalRouterLB(lr, lb) :-
 nb::Logical_Router(._uuid = lr, .load_balancer = lbs),
 var lb_uuid = FlatMap(lbs),
 lb in &LoadBalancerRef[nb::Load_Balancer{._uuid = lb_uuid}].
 
-relation LogicalRouterLBs(lr: uuid, nat: Vec>)
+relation LogicalRouterLBs(lr: uuid, nat: Vec>)
 
 LogicalRouterLBs(lr, lbs) :-
  LogicalRouterLB(lr, lb),
@@ -448,8 +448,8 @@ typedef Router = Router {
 is_gateway: bool,
 nats:   Vec,
 snat_ips:   Map>,
-lbs:Vec>,
-mcast_cfg:  Ref,
+lbs:Vec>,
+mcast_cfg:  Intern,
 learn_from_arp_request: bool,
 force_lb_snat: bool,
 }
@@ -491,7 +491,7 @@ Router[Router{
 var force_lb_snat = lb_force_snat_router_ip(lr.options).
 
 /* RouterLB: many-to-many relation between logical routers and nb::LB */
-relation RouterLB(router: Intern, lb: Ref)
+relation RouterLB(router: Intern, lb: Intern)
 
 RouterLB(router, lb) :-
 router in &Router(.lbs = lbs),
@@ -500,7 +500,7 @@ RouterLB(router, lb) :-
 /* Load balancer VIPs ass

[ovs-dev] [PATCH v2 14/26] ovn-northd-ddlog: Workaround for slow group_by.

2021-04-01 Thread Ben Pfaff
From: Leonid Ryzhyk 

This patch is a workaround for a performance issue in the DDlog
compiler.  The issue will hopefully be resolved in a future version of
DDlog, but for now we need this and possibly a few other similar fixes.

Here is one affected rule:

```
sb::Out_Port_Group(._uuid = hash128(sb_name), .name = sb_name, .ports = 
port_names) :-
nb::Port_Group(._uuid = _uuid, .name = nb_name, .ports = pg_ports),
var port_uuid = FlatMap(pg_ports),
&SwitchPort(.lsp = lsp@nb::Logical_Switch_Port{._uuid = port_uuid,
   .name = port_name},
.sw = &Switch{.ls = nb::Logical_Switch{._uuid = ls_uuid}}),
TunKeyAllocation(.datapath = ls_uuid, .tunkey = tunkey),
var sb_name = "${tunkey}_${nb_name}",
var port_names = port_name.group_by((_uuid, sb_name)).to_set().
```

The first literal in the body of the rule binds variable `pg_ports` to
the array of ports in the port group.  This is a potentially large
array that immediately gets flattened by the `FlatMap` operator.
Since the `pg_ports` variable is not used in the remainder of the rule,
DDlog normally would not propagate it through the rest of the rule.
Unfortunately, due to a subtle semantic quirk, the behavior is different
when there is a `group_by` operator further down in the rule, in which
case unused variables are still propagated through the rule, which
involves expensive copies.

The workaround I implemented factors the first two terms in the
rule into a separate `PortGroupPort` relation, so that the ports array
no longer occurs in the new version of the rule:

```
sb::Out_Port_Group(._uuid = hash128(sb_name), .name = sb_name, .ports = 
port_names) :-
PortGroupPort(.pg_uuid = _uuid, .pg_name = nb_name, .port = port_uuid),
&SwitchPort(.lsp = lsp@nb::Logical_Switch_Port{._uuid = port_uuid,
   .name = port_name},
.sw = &Switch{.ls = nb::Logical_Switch{._uuid = ls_uuid}}),
TunKeyAllocation(.datapath = ls_uuid, .tunkey = tunkey),
var sb_name = "${tunkey}_${nb_name}",
var port_names = port_name.group_by((_uuid, sb_name)).to_set().
```

Again, benchmarking is likely to reveal more instances of this.  A
proper fix will require a change to the DDlog compiler.

Signed-off-by: Leonid Ryzhyk 
Signed-off-by: Ben Pfaff 
---
 northd/ovn_northd.dl | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl
index 80d8598bd7dc..5a7a11295964 100644
--- a/northd/ovn_northd.dl
+++ b/northd/ovn_northd.dl
@@ -712,11 +712,10 @@ sb::Out_Address_Set(._uuid = hash128("svc_monitor_mac"),
 SvcMonitorMac(svc_monitor_mac).
 
 sb::Out_Address_Set(hash128(as_name), as_name, pg_ip4addrs.union()) :-
-nb::Port_Group(.ports = pg_ports, .name = pg_name),
+PortGroupPort(.pg_name = pg_name, .port = port_uuid),
 var as_name = pg_name ++ "_ip4",
 // avoid name collisions with user-defined Address_Sets
 not nb::Address_Set(.name = as_name),
-var port_uuid = FlatMap(pg_ports),
 PortStaticAddresses(.lsport = port_uuid, .ip4addrs = stat),
 SwitchPortNewDynamicAddress(&SwitchPort{.lsp = 
nb::Logical_Switch_Port{._uuid = port_uuid}},
 dyn_addr),
@@ -738,11 +737,10 @@ sb::Out_Address_Set(hash128(as_name), as_name, 
set_empty()) :-
 not nb::Address_Set(.name = as_name).
 
 sb::Out_Address_Set(hash128(as_name), as_name, pg_ip6addrs.union()) :-
-nb::Port_Group(.ports = pg_ports, .name = pg_name),
+PortGroupPort(.pg_name = pg_name, .port = port_uuid),
 var as_name = pg_name ++ "_ip6",
 // avoid name collisions with user-defined Address_Sets
 not nb::Address_Set(.name = as_name),
-var port_uuid = FlatMap(pg_ports),
 PortStaticAddresses(.lsport = port_uuid, .ip6addrs = stat),
 SwitchPortNewDynamicAddress(&SwitchPort{.lsp = 
nb::Logical_Switch_Port{._uuid = port_uuid}},
 dyn_addr),
@@ -771,9 +769,18 @@ sb::Out_Address_Set(hash128(as_name), as_name, 
set_empty()) :-
  * SB Port_Group.name uniqueness constraint, ovn-northd populates the field
  * with the value: _.
  */
+
+relation PortGroupPort(
+pg_uuid: uuid,
+pg_name: string,
+port: uuid)
+
+PortGroupPort(pg_uuid, pg_name, port) :-
+nb::Port_Group(._uuid = pg_uuid, .name = pg_name, .ports = pg_ports),
+var port = FlatMap(pg_ports).
+
 sb::Out_Port_Group(._uuid = hash128(sb_name), .name = sb_name, .ports = 
port_names) :-
-nb::Port_Group(._uuid = _uuid, .name = nb_name, .ports = pg_ports),
-var port_uuid = FlatMap(pg_ports),
+PortGroupPort(.pg_uuid = _uuid, .pg_name = nb_name, .port = port_uuid),
 &SwitchPort(.lsp = lsp@nb::Logical_Switch_Port{._uuid = port_uuid,
.name = port_name},
 .sw = &Switch{.ls = nb::Logical_Switch{._uuid = ls_uuid}}),
-- 
2.29.2


[ovs-dev] [PATCH v2 16/26] ovn-northd-ddlog: Remove `ls` field from `Switch`.

2021-04-01 Thread Ben Pfaff
From: Leonid Ryzhyk 

This commit is analogous to 076749c99, but for switches instead of routers.

`relation Switch` stores the internal representation of a logical
switch, consisting of values from the `nb::Logical_Switch` table
augmented with some additional fields.  We used to do this by
copying the entire `Logical_Switch` record inside `Switch`.  This
proved highly inefficient in scenarios where some of the entities that
`Logical_Switch` references (logical switch ports, ACLs, or QoS rules)
change frequently.  Every such change modifies the `Logical_Switch`
record, which triggers an update of the `Switch` object, which can cause
a bunch of rules to update their outputs.

As a workaround, we no longer store the entire `Logical_Switch` object
in the `Switch` table, and instead only copy its relevant fields.

Signed-off-by: Leonid Ryzhyk 
Signed-off-by: Ben Pfaff 
---
 northd/ipam.dl   |  25 ++-
 northd/lswitch.dl|  24 ++-
 northd/ovn_northd.dl | 356 ++-
 3 files changed, 210 insertions(+), 195 deletions(-)

diff --git a/northd/ipam.dl b/northd/ipam.dl
index 589126f81288..e7373f250a7f 100644
--- a/northd/ipam.dl
+++ b/northd/ipam.dl
@@ -95,18 +95,17 @@ function parse_dynamic_address_request(s: string): 
Option)
 
 /* Add reserved address groups (1) and (2). */
-SwitchIPv4ReservedAddress(.lswitch = ls._uuid,
+SwitchIPv4ReservedAddress(.lswitch = sw._uuid,
   .addr= addr) :-
-&Switch(.ls = ls,
-.subnet = Some{(_, _, start_ipv4, total_ipv4s)}),
+sw in &Switch(.subnet = Some{(_, _, start_ipv4, total_ipv4s)}),
 var exclude_ips = {
 var exclude_ips = set_singleton(start_ipv4);
 exclude_ips.insert(start_ipv4 + total_ipv4s - 1);
-match (map_get(ls.other_config, "exclude_ips")) {
+match (map_get(sw.other_config, "exclude_ips")) {
 None -> exclude_ips,
 Some{exclude_ip_list} -> match (parse_ip_list(exclude_ip_list)) {
 Left{err} -> {
-warn("logical switch ${uuid2str(ls._uuid)}: bad 
exclude_ips (${err})");
+warn("logical switch ${uuid2str(sw._uuid)}: bad 
exclude_ips (${err})");
 exclude_ips
 },
 Right{ranges} -> {
@@ -124,7 +123,7 @@ SwitchIPv4ReservedAddress(.lswitch = ls._uuid,
 exclude_ips.insert(addr)
 }
 } else {
-warn("logical switch ${uuid2str(ls._uuid)}: 
excluded addresses not in subnet")
+warn("logical switch ${uuid2str(sw._uuid)}: 
excluded addresses not in subnet")
 }
 };
 exclude_ips
@@ -135,11 +134,11 @@ SwitchIPv4ReservedAddress(.lswitch = ls._uuid,
 var addr = FlatMap(exclude_ips).
 
 /* Add reserved address group (3). */
-SwitchIPv4ReservedAddress(.lswitch = ls._uuid,
+SwitchIPv4ReservedAddress(.lswitch = ls_uuid,
   .addr= addr) :-
 SwitchPortStaticAddresses(
 .port = &SwitchPort{
-.sw = &Switch{.ls = ls,
+.sw = &Switch{._uuid = ls_uuid,
   .subnet = Some{(_, _, start_ipv4, total_ipv4s)}},
 .peer = None},
 .addrs = lport_addrs
@@ -157,10 +156,10 @@ SwitchIPv4ReservedAddress(.lswitch = ls._uuid,
 var addr = FlatMap(addrs).
 
 /* Add reserved address group (4) */
-SwitchIPv4ReservedAddress(.lswitch = ls._uuid,
+SwitchIPv4ReservedAddress(.lswitch = ls_uuid,
   .addr= addr) :-
 &SwitchPort(
-.sw = &Switch{.ls = ls,
+.sw = &Switch{._uuid = ls_uuid,
   .subnet = Some{(_, _, start_ipv4, total_ipv4s)}},
 .peer = Some{&rport}),
 var addrs = {
@@ -176,7 +175,7 @@ SwitchIPv4ReservedAddress(.lswitch = ls._uuid,
 var addr = FlatMap(addrs).
 
 /* Add reserved address group (5) */
-SwitchIPv4ReservedAddress(.lswitch = sw.ls._uuid,
+SwitchIPv4ReservedAddress(.lswitch = sw._uuid,
   .addr= ip_addr.a) :-
 &SwitchPort(.sw = &sw, .lsp = lsp, .static_dynamic_ipv4 = Some{ip_addr}).
 
@@ -199,7 +198,7 @@ SwitchPortAllocatedIPv4DynAddress(lsport, dyn_addr) :-
 /* Aggregate all ports of a switch that need a dynamic IP address */
 port in &SwitchPort(.needs_dynamic_ipv4address = true,
 .sw = &sw),
-var switch_id = sw.ls._uuid,
+var switch_id = sw._uuid,
 var ports = port.group_by(switch_id).to_vec(),
 SwitchIPv4ReservedAddresses(switch_id, reserved_addrs),
 /* Allocate dynamic addresses only for ports that don't have a dynamic 
address
@@ -437,7 +436,7 @@ SwitchPortNewMACDynAddress(lsp._uuid, mac_addr) :-
 None -> None,
 Some{addr} -> {
 if (sw.subnet.is_some() or sw.ipv6_prefix.is_some() or
-map_get(sw.ls.other_config, "mac_only"

[ovs-dev] [PATCH v2 12/26] ovn-northd-ddlog: Remove `lr` field from `Router`.

2021-04-01 Thread Ben Pfaff
From: Leonid Ryzhyk 

`relation Router` stores the internal representation of a logical
router, consisting of values from the `nb::Logical_Router` table
augmented with some additional fields.  We used to do this by
copying the entire `Logical_Router` record inside `Router`.  This
proved highly inefficient in scenarios where the set of router ports
changes frequently.  Every such change modifies the `ports` array
inside `Logical_Router`, which triggers an update of the `Router`
object, which can cause a bunch of rules to update their outputs.  This
recomputation is unnecessary as none of these rules look at the `ports`
field (`ports` is a slightly backwards way to maintain the relationship
between ports and routers by storing the array of ports in the router
instead of having each port point to the router).

As a workaround, we no longer store the entire `Logical_Router` object
in the `Router` table, and instead only copy its relevant fields.
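
Why copying only the relevant fields helps can be sketched in Rust
with simplified, hypothetical types (`LogicalRouter`, `RouterWithLr`,
`RouterProjected` and the `is_gateway` computation are illustrative
and do not match the real schema).  A change to `ports` alters the
embedded record but leaves the projected one equal to its previous
value, so nothing downstream would need to be recomputed:

```rust
use std::collections::BTreeMap;

// Hypothetical, heavily simplified stand-in for nb::Logical_Router.
#[derive(Clone, Debug, PartialEq)]
struct LogicalRouter {
    uuid: u128,
    name: String,
    options: BTreeMap<String, String>,
    ports: Vec<u128>, // changes often, but is not read by the rules
}

// Old shape: the derived record embeds the whole NB record.
#[derive(Clone, Debug, PartialEq)]
struct RouterWithLr {
    lr: LogicalRouter,
    is_gateway: bool,
}

// New shape: only the fields that the rules read are copied.
#[derive(Clone, Debug, PartialEq)]
struct RouterProjected {
    uuid: u128,
    name: String,
    options: BTreeMap<String, String>,
    is_gateway: bool,
}

fn embed(lr: &LogicalRouter) -> RouterWithLr {
    RouterWithLr { lr: lr.clone(), is_gateway: lr.options.contains_key("chassis") }
}

fn project(lr: &LogicalRouter) -> RouterProjected {
    RouterProjected {
        uuid: lr.uuid,
        name: lr.name.clone(),
        options: lr.options.clone(),
        is_gateway: lr.options.contains_key("chassis"),
    }
}

fn main() {
    let before = LogicalRouter {
        uuid: 1,
        name: String::from("lr0"),
        options: BTreeMap::new(),
        ports: vec![10, 11],
    };
    // Simulate a transaction that adds one router port.
    let mut after = before.clone();
    after.ports.push(12);

    // The embedded record changes, so every dependent rule sees an update.
    assert_ne!(embed(&before), embed(&after));
    // The projected record is unchanged, so the change stops here.
    assert_eq!(project(&before), project(&after));
}
```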

Signed-off-by: Leonid Ryzhyk 
Signed-off-by: Ben Pfaff 
---
 northd/lrouter.dl|  47 --
 northd/ovn_northd.dl | 374 +--
 2 files changed, 220 insertions(+), 201 deletions(-)

diff --git a/northd/lrouter.dl b/northd/lrouter.dl
index e3afff72f41d..574926b73b67 100644
--- a/northd/lrouter.dl
+++ b/northd/lrouter.dl
@@ -329,10 +329,10 @@ LogicalRouterNATs(lr, vec_empty()) :-
 nb::Logical_Router(._uuid = lr),
 not LogicalRouterNAT(lr, _).
 
-function get_force_snat_ip(lr: nb::Logical_Router, key_type: string): 
Set =
+function get_force_snat_ip(options: Map, key_type: string): 
Set =
 {
 var ips = set_empty();
-match (lr.options.get(key_type ++ "_force_snat_ip")) {
+match (options.get(key_type ++ "_force_snat_ip")) {
 None -> (),
 Some{s} -> {
 for (token in s.split(" ")) {
@@ -346,8 +346,8 @@ function get_force_snat_ip(lr: nb::Logical_Router, 
key_type: string): Set, key_type: string): 
bool {
+not get_force_snat_ip(options, key_type).is_empty()
 }
 
 function lb_force_snat_router_ip(lr_options: Map): bool {
@@ -355,8 +355,8 @@ function lb_force_snat_router_ip(lr_options: Map): bool {
 lr_options.contains_key("chassis")
 }
 
-function force_snat_for_lb(lr: nb::Logical_Router): bool {
-not get_force_snat_ip(lr, "lb").is_empty() or 
lb_force_snat_router_ip(lr.options)
+function force_snat_for_lb(lr_options: Map): bool {
+not get_force_snat_ip(lr_options, "lb").is_empty() or 
lb_force_snat_router_ip(lr_options)
 }
 
 /* For each router, collect the set of IPv4 and IPv6 addresses used for SNAT,
@@ -370,11 +370,11 @@ function force_snat_for_lb(lr: nb::Logical_Router): bool {
 relation LogicalRouterSnatIP(lr: uuid, snat_ip: v46_ip, nat: Option)
 LogicalRouterSnatIP(lr._uuid, force_snat_ip, None) :-
 lr in nb::Logical_Router(),
-var dnat_force_snat_ips = get_force_snat_ip(lr, "dnat"),
+var dnat_force_snat_ips = get_force_snat_ip(lr.options, "dnat"),
 var lb_force_snat_ips = if (lb_force_snat_router_ip(lr.options)) {
 set_empty()
 } else {
-get_force_snat_ip(lr, "lb")
+get_force_snat_ip(lr.options, "lb")
 },
 var force_snat_ip = FlatMap(dnat_force_snat_ips.union(lb_force_snat_ips)).
 LogicalRouterSnatIP(lr, snat_ip, Some{nat}) :-
@@ -418,7 +418,6 @@ LogicalRouterLBs(lr, vec_empty()) :-
 
 /* Router relation collects all attributes of a logical router.
  *
- * `lr` - Logical_Router record from the NB database
  * `l3dgw_port` - optional redirect port (see `DistributedGatewayPort`)
  * `redirect_port_name` - derived redirect port name (or empty string if
  *  router does not have a redirect port)
@@ -432,7 +431,18 @@ LogicalRouterLBs(lr, vec_empty()) :-
 function chassis_redirect_name(port_name: string): string = "cr-${port_name}"
 
 relation &Router(
-lr: nb::Logical_Router,
+/* Fields copied from nb::Logical_Router. */
+_uuid:  uuid,
+name:   string,
+static_routes:  Set,
+policies:   Set,
+enabled:Option,
+nat:Set,
+load_balancer:  Set,
+options:Map,
+external_ids:   Map,
+
+/* Additional computed fields. */
 l3dgw_port: Option,
 redirect_port_name: string,
 is_gateway: bool,
@@ -444,7 +454,16 @@ relation &Router(
 force_lb_snat: bool,
 )
 
-&Router(.lr = lr,
+&Router(._uuid =lr._uuid,
+.name  =lr.name,
+.static_routes =lr.static_routes,
+.policies  =lr.policies,
+.enabled   =lr.enabled,
+.nat   =lr.nat,
+.load_balancer =lr.load_balancer,
+.options   =lr.options,
+.external_ids  =lr.external_ids,
+
 .l3dgw_port = l3dgw_port,
 .redirect_port_name =
 match (l3dgw_port) {
@@ -576,7 +595,7 @@ relation &RouterPort(
 nb::Logical_Router_Port[lrp],
 Some{var networks} = extract

[ovs-dev] [PATCH v2 08/26] ovn-sbctl: Add daemon support.

2021-04-01 Thread Ben Pfaff
Also rewrite the manpage and convert it to XML for consistency with
ovn-nbctl, and add tests.

Signed-off-by: Ben Pfaff 
---
 NEWS  |   4 +-
 manpages.mk   |  17 -
 tests/ovn-sbctl.at|  76 +++--
 utilities/automake.mk |   7 +-
 utilities/ovn-dbctl.c |  24 +-
 utilities/ovn-dbctl.h |   3 +-
 utilities/ovn-nbctl.c |   1 +
 utilities/ovn-sbctl.8.in  | 317 --
 utilities/ovn-sbctl.8.xml | 580 +
 utilities/ovn-sbctl.c | 670 +++---
 10 files changed, 783 insertions(+), 916 deletions(-)
 delete mode 100644 utilities/ovn-sbctl.8.in
 create mode 100644 utilities/ovn-sbctl.8.xml

diff --git a/NEWS b/NEWS
index 8b170bcba6fb..a98529ac4ebe 100644
--- a/NEWS
+++ b/NEWS
@@ -7,7 +7,9 @@ Post-v21.03.0
 (This may take testing and tuning to be effective.)  This version of OVN
 requires DDLog 0.36.
   - Introduce ovn-controller incremetal processing engine statistics
-  - ovn-nbctl daemon mode is no longer considered experimental.
+  - Utilities:
+* ovn-nbctl daemon mode is no longer considered experimental.
+* ovn-sbctl now also supports daemon mode.
 
 OVN v21.03.0 - 12 Mar 2021
 -
diff --git a/manpages.mk b/manpages.mk
index 44e544681424..3334b38f943d 100644
--- a/manpages.mk
+++ b/manpages.mk
@@ -10,20 +10,3 @@ lib/common-syn.man:
 lib/common.man:
 lib/ovs.tmac:
 
-utilities/ovn-sbctl.8: \
-   utilities/ovn-sbctl.8.in \
-   lib/common.man \
-   lib/db-ctl-base.man \
-   lib/ovs.tmac \
-   lib/ssl-bootstrap.man \
-   lib/ssl.man \
-   lib/table.man \
-   lib/vlog.man
-utilities/ovn-sbctl.8.in:
-lib/common.man:
-lib/db-ctl-base.man:
-lib/ovs.tmac:
-lib/ssl-bootstrap.man:
-lib/ssl.man:
-lib/table.man:
-lib/vlog.man:
diff --git a/tests/ovn-sbctl.at b/tests/ovn-sbctl.at
index 2712cc15490c..9334762fd313 100644
--- a/tests/ovn-sbctl.at
+++ b/tests/ovn-sbctl.at
@@ -1,9 +1,14 @@
 AT_BANNER([ovn-sbctl])
 
+OVS_START_SHELL_HELPERS
 # OVN_SBCTL_TEST_START
 m4_define([OVN_SBCTL_TEST_START],
-  [dnl Create databases (ovn-nb, ovn-sb).
-   AT_KEYWORDS([ovn])
+  [AT_KEYWORDS([ovn])
+   AT_CAPTURE_FILE([ovsdb-server.log])
+   AT_CAPTURE_FILE([ovn-northd.log])
+   ovn_sbctl_test_start $1])
+ovn_sbctl_test_start() {
+   dnl Create databases (ovn-nb, ovn-sb).
for daemon in ovn-nb ovn-sb; do
   AT_CHECK([ovsdb-tool create $daemon.db 
$abs_top_srcdir/${daemon}.ovsschema])
done
@@ -15,27 +20,54 @@ m4_define([OVN_SBCTL_TEST_START],
AT_CHECK([[sed < stderr '
 /vlog|INFO|opened log file/d
 /ovsdb_server|INFO|ovsdb-server (Open vSwitch)/d']])
-   AT_CAPTURE_FILE([ovsdb-server.log])
 
dnl Start ovn-northd.
AT_CHECK([ovn-northd --detach --no-chdir --pidfile --log-file 
--ovnnb-db=unix:$OVS_RUNDIR/ovnnb_db.sock 
--ovnsb-db=unix:$OVS_RUNDIR/ovnsb_db.sock], [0], [], [stderr])
on_exit "kill `cat ovn-northd.pid`"
AT_CHECK([[sed < stderr '
 /vlog|INFO|opened log file/d']])
-   AT_CAPTURE_FILE([ovn-northd.log])
-])
+
+   AS_CASE([$1],
+ [daemon],
+   [export OVN_SB_DAEMON=$(ovn-sbctl --pidfile --detach --no-chdir 
--log-file -vsocket_util:off)
+on_exit "kill `cat ovn-sbctl.pid`"],
+ [direct], [],
+ [*], [AT_FAIL_IF(:)])
+}
 
 # OVN_SBCTL_TEST_STOP
-m4_define([OVN_SBCTL_TEST_STOP],
-  [AT_CHECK([check_logs "$1"])
-   OVS_APP_EXIT_AND_WAIT([ovn-northd])
-   OVS_APP_EXIT_AND_WAIT_BY_TARGET([$OVS_RUNDIR/ovnnb_db.ctl], 
[$OVS_RUNDIR/ovnnb_db.pid])
-   OVS_APP_EXIT_AND_WAIT_BY_TARGET([$OVS_RUNDIR/ovnsb_db.ctl], 
[$OVS_RUNDIR/ovnsb_db.pid])])
+m4_define([OVN_SBCTL_TEST_STOP], [ovn_sbctl_test_stop])
+ovn_sbctl_test_stop() {
+  AT_CHECK([check_logs "$1"])
+  OVS_APP_EXIT_AND_WAIT([ovn-northd])
+  OVS_APP_EXIT_AND_WAIT_BY_TARGET([$OVS_RUNDIR/ovnnb_db.ctl], 
[$OVS_RUNDIR/ovnnb_db.pid])
+  OVS_APP_EXIT_AND_WAIT_BY_TARGET([$OVS_RUNDIR/ovnsb_db.ctl], 
[$OVS_RUNDIR/ovnsb_db.pid])
+}
+OVS_END_SHELL_HELPERS
+
+# OVN_SBCTL_TEST(NAME, TITLE, COMMANDS)
+m4_define([OVN_SBCTL_TEST],
+   [OVS_START_SHELL_HELPERS
+$1() {
+  $3
+}
+OVS_END_SHELL_HELPERS
+
+AT_SETUP([ovn-sbctl - $2 - direct])
+OVN_SBCTL_TEST_START direct
+$1
+OVN_SBCTL_TEST_STOP
+AT_CLEANUP
+
+AT_SETUP([ovn-sbctl - $2 - daemon])
+OVN_SBCTL_TEST_START daemon
+$1
+OVN_SBCTL_TEST_STOP
+AT_CLEANUP])
 
 dnl -
 
-AT_SETUP([ovn-sbctl - chassis commands])
-OVN_SBCTL_TEST_START
+OVN_SBCTL_TEST([ovn_sbctl_chassis_commands], [ovn-sbctl - chassis commands], [
 ovn_init_db ovn-sb
 
 AT_CHECK([ovn-sbctl chassis-add ch0 geneve 1.2.3.4])
@@ -61,16 +93,14 @@ AT_CHECK([ovn-sbctl -f csv -d bare --no-headings --columns ip,type list encap |
 1.2.3.5,vxlan
 ])
 
-OVN_SBCTL_TEST_STOP
 as ovn-sb
 OVS_APP_EXIT_AND_WAIT([ovsdb-server])
-AT_CLEANUP
+as
+])
 
 dnl -
 
-AT_SETUP([ovn-sbctl])
-O

[ovs-dev] [PATCH v2 13/26] ovn-northd-ddlog: Intern the `Router` table.

2021-04-01 Thread Ben Pfaff
From: Leonid Ryzhyk 

This is the first in a series of commits that will replace the use of
DDlog's `Ref<>` type with `Intern<>` throughout the OVN code base.
`Ref` and `Intern` are the two forms of smart pointers supported by
DDlog at the moment.  `Ref` is a reference-counted pointer.  Copying
a `Ref<>` simply increments its reference count.  `Intern<>` is an
interned object reference.  It guarantees that there exists exactly
one copy of each unique interned value.  Interned objects are slightly
more expensive to create, but they have several important advantages:
(1) they save memory by deduplicating identical values, (2) they allow
by-pointer comparisons, and (3) they avoid unnecessary recomputations
in some scenarios. See DDlog docs [1], [2] for more detail.

In this commit we change the type of records in the `Router` table from
`Ref<Router>` to `Intern<Router>`.  This reduces the amount of churn
and speeds up northd significantly in scenarios where the set of router
ports changes frequently, which triggers updates to
`nb::Logical_Router`, which in turn updates corresponding records
in the `Router` table.  Interning guarantees that these updates are
no-ops and do not trigger any other rules.
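
To make the idea concrete outside of DDlog, here is a minimal sketch of
interning in plain C.  It is only an illustration, not how DDlog
implements `Intern<>`: the names `intern_str` and `interned` and the
fixed-size table are hypothetical.  Equal values share one stored copy,
so equality reduces to a pointer comparison:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size intern table, for illustration only. */
#define MAX_INTERNED 64
static char *interned[MAX_INTERNED];
static size_t n_interned;

/* Returns the unique stored copy of 's', adding one if none exists yet.
 * Creation costs a lookup (here a linear scan), which is why interned
 * values are slightly more expensive to create than plain copies.
 * Returns NULL if the table is full or memory runs out. */
static const char *
intern_str(const char *s)
{
    assert(s != NULL);
    for (size_t i = 0; i < n_interned; i++) {
        if (!strcmp(interned[i], s)) {
            return interned[i];     /* Deduplicated: reuse existing copy. */
        }
    }
    if (n_interned >= MAX_INTERNED) {
        return NULL;
    }
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (!copy) {
        return NULL;
    }
    memcpy(copy, s, len);
    interned[n_interned++] = copy;
    return copy;
}
```

Two calls with equal contents return the same pointer, so a consumer
that sees an "updated" value with an unchanged pointer can treat the
update as a no-op.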

[1] 
https://github.com/vmware/differential-datalog/blob/master/doc/tutorial/tutorial.md#reference-type-ref
[2] 
https://github.com/vmware/differential-datalog/blob/master/doc/tutorial/tutorial.md#interned-values-intern-istring

Signed-off-by: Leonid Ryzhyk 
Signed-off-by: Ben Pfaff 
---
 northd/lrouter.dl| 21 -
 northd/multicast.dl  |  6 +++---
 northd/ovn_northd.dl | 31 +++
 3 files changed, 30 insertions(+), 28 deletions(-)

diff --git a/northd/lrouter.dl b/northd/lrouter.dl
index 574926b73b67..b2b429af3c96 100644
--- a/northd/lrouter.dl
+++ b/northd/lrouter.dl
@@ -430,7 +430,7 @@ LogicalRouterLBs(lr, vec_empty()) :-
 
 function chassis_redirect_name(port_name: string): string = "cr-${port_name}"
 
-relation &Router(
+typedef Router = Router {
 /* Fields copied from nb::Logical_Router. */
 _uuid:  uuid,
 name:   string,
@@ -452,9 +452,12 @@ relation &Router(
 mcast_cfg:  Ref,
 learn_from_arp_request: bool,
 force_lb_snat: bool,
-)
+}
+
+relation Router[Intern]
 
-&Router(._uuid =lr._uuid,
+Router[Router{
+._uuid =lr._uuid,
 .name  =lr.name,
 .static_routes =lr.static_routes,
 .policies  =lr.policies,
@@ -476,7 +479,7 @@ relation &Router(
 .lbs= lbs,
 .mcast_cfg  = mcast_cfg,
 .learn_from_arp_request = learn_from_arp_request,
-.force_lb_snat = force_lb_snat) :-
+.force_lb_snat = force_lb_snat}.intern()] :-
 lr in nb::Logical_Router(),
 lr.is_enabled(),
 LogicalRouterRedirectPort(lr._uuid, l3dgw_port),
@@ -488,7 +491,7 @@ relation &Router(
 var force_lb_snat = lb_force_snat_router_ip(lr.options).
 
 /* RouterLB: many-to-many relation between logical routers and nb::LB */
-relation RouterLB(router: Ref, lb: Ref)
+relation RouterLB(router: Intern, lb: Ref)
 
 RouterLB(router, lb) :-
 router in &Router(.lbs = lbs),
@@ -496,7 +499,7 @@ RouterLB(router, lb) :-
 
 /* Load balancer VIPs associated with routers */
 relation RouterLBVIP(
-router: Ref,
+router: Intern,
 lb: Ref,
 vip: string,
 backends: string)
@@ -576,7 +579,7 @@ relation &RouterPort(
 lrp:  nb::Logical_Router_Port,
 json_name:string,
 networks: lport_addresses,
-router:   Ref,
+router:   Intern,
 is_redirect:  bool,
 peer: RouterPeer,
 mcast_cfg:Ref,
@@ -711,7 +714,7 @@ function find_lrp_member_ip(networks: lport_addresses, ip: 
v46_ip): Option,
+router  : Intern,
 key : route_key,
 nexthop : v46_ip,
 output_port : Option,
@@ -735,7 +738,7 @@ typedef route_dst = RouteDst {
 }
 
 relation RouterStaticRoute(
-router  : Ref,
+router  : Intern,
 key : route_key,
 dsts: Set)
 
diff --git a/northd/multicast.dl b/northd/multicast.dl
index 990203bffe25..9b0fa80738d7 100644
--- a/northd/multicast.dl
+++ b/northd/multicast.dl
@@ -160,7 +160,7 @@ SwitchMcastFloodReportPorts(switch, set_empty()) :-
 /* Mapping between Router and the set of port uuids on which to
  * flood IP multicast reports statically.
  */
-relation RouterMcastFloodPorts(sw: Ref, ports: Set)
+relation RouterMcastFloodPorts(sw: Intern, ports: Set)
 
 RouterMcastFloodPorts(router, flood_ports) :-
 &RouterPort(
@@ -213,7 +213,7 @@ IgmpSwitchMulticastGroup(address, switch, ports) :-
  */
 relation IgmpRouterGroupPort(
 address: string,
-router : Ref,
+router : Intern,
 port   : uuid
 )
 
@@ -236,7 +236,7 @@ IgmpRouterGroupPort(address, rtr_port.router, rtr_port.lrp._uuid) :-
  */
 relation IgmpRouterMulticastGroup(
 address: string,
-router : 

[ovs-dev] [PATCH v2 06/26] ovn-nbctl: Refactor into infrastructure and northbound details.

2021-04-01 Thread Ben Pfaff
In an upcoming commit, this will allow adding daemon mode to ovn-sbctl
without having a lot of duplicated code.

Signed-off-by: Ben Pfaff 
---
 utilities/automake.mk |5 +-
 utilities/ovn-dbctl.c | 1214 
 utilities/ovn-dbctl.h |   60 ++
 utilities/ovn-nbctl.c | 1366 -
 4 files changed, 1411 insertions(+), 1234 deletions(-)
 create mode 100644 utilities/ovn-dbctl.c
 create mode 100644 utilities/ovn-dbctl.h

diff --git a/utilities/automake.mk b/utilities/automake.mk
index c4a6d248c274..50c0cfded018 100644
--- a/utilities/automake.mk
+++ b/utilities/automake.mk
@@ -71,7 +71,10 @@ utilities/ovn-lib: $(top_builddir)/config.status
 
 # ovn-nbctl
 bin_PROGRAMS += utilities/ovn-nbctl
-utilities_ovn_nbctl_SOURCES = utilities/ovn-nbctl.c
+utilities_ovn_nbctl_SOURCES = \
+utilities/ovn-dbctl.c \
+utilities/ovn-dbctl.h \
+utilities/ovn-nbctl.c
 utilities_ovn_nbctl_LDADD = lib/libovn.la $(OVSDB_LIBDIR)/libovsdb.la $(OVS_LIBDIR)/libopenvswitch.la
 
 # ovn-sbctl
diff --git a/utilities/ovn-dbctl.c b/utilities/ovn-dbctl.c
new file mode 100644
index ..28ebc6267066
--- /dev/null
+++ b/utilities/ovn-dbctl.c
@@ -0,0 +1,1214 @@
+/*
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include 
+
+#include "ovn-dbctl.h"
+
+#include 
+
+#include "command-line.h"
+#include "daemon.h"
+#include "db-ctl-base.h"
+#include "fatal-signal.h"
+#include "jsonrpc.h"
+#include "memory.h"
+#include "openvswitch/poll-loop.h"
+#include "openvswitch/vlog.h"
+#include "ovn-util.h"
+#include "ovsdb-idl.h"
+#include "process.h"
+#include "simap.h"
+#include "stream-ssl.h"
+#include "svec.h"
+#include "table.h"
+#include "timer.h"
+#include "unixctl.h"
+#include "util.h"
+
+VLOG_DEFINE_THIS_MODULE(ovn_dbctl);
+
+/* --db: The database server to contact. */
+static const char *db;
+
+/* --oneline: Write each command's output as a single line? */
+static bool oneline;
+
+/* --dry-run: Do not commit any changes. */
+static bool dry_run;
+
+/* --wait=TYPE: Wait for configuration change to take effect? */
+static enum nbctl_wait_type wait_type = NBCTL_WAIT_NONE;
+
+static bool print_wait_time = false;
+
+/* --timeout: Time to wait for a connection to 'db'. */
+static unsigned int timeout;
+
+/* Format for table output. */
+static struct table_style table_style = TABLE_STYLE_DEFAULT;
+
+/* The IDL we're using and the current transaction, if any.  This is for use by
+ * ovn_dbctl_exit() only, to allow it to clean up.  Other code should use its
+ * context arguments. */
+static struct ovsdb_idl *the_idl;
+static struct ovsdb_idl_txn *the_idl_txn;
+
+/* --leader-only, --no-leader-only: Only accept the leader in a cluster. */
+static int leader_only = true;
+
+/* --shuffle-remotes, --no-shuffle-remotes: Shuffle the order of remotes that
+ * are specified in the connection method string. */
+static int shuffle_remotes = true;
+
+/* --unixctl-path: Path to use for unixctl server socket, for daemon mode. */
+static char *unixctl_path;
+
+static unixctl_cb_func server_cmd_exit;
+static unixctl_cb_func server_cmd_run;
+
+static struct option *get_all_options(void);
+static bool has_option(const struct ovs_cmdl_parsed_option *, size_t n,
+   int option);
+static void dbctl_client(const struct ovn_dbctl_options *dbctl_options,
+ const char *socket_name,
+ const struct ovs_cmdl_parsed_option *, size_t n,
+ int argc, char *argv[]);
+static bool will_detach(const struct ovs_cmdl_parsed_option *, size_t n);
+static void apply_options_direct(const struct ovn_dbctl_options *dbctl_options,
+ const struct ovs_cmdl_parsed_option *,
+ size_t n, struct shash *local_options);
+static char * OVS_WARN_UNUSED_RESULT run_prerequisites(
+const struct ovn_dbctl_options *dbctl_options,
+struct ctl_command[], size_t n_commands, struct ovsdb_idl *);
+static char * OVS_WARN_UNUSED_RESULT do_dbctl(
+const struct ovn_dbctl_options *dbctl_options,
+const char *args, struct ctl_command *, size_t n,
+struct ovsdb_idl *, const struct timer *, bool *retry);
+static char * OVS_WARN_UNUSED_RESULT main_loop(
+const struct ovn_dbctl_options *, const char *args,
+struct ctl_command *commands, size_t n_commands,
+struct ovsdb_idl *idl, const struct timer *);
+static void server_loop(const struct ovn_db

[ovs-dev] [PATCH v2 11/26] ovn-northd-ddlog: Preserve NB_Global more carefully.

2021-04-01 Thread Ben Pfaff
Dumitru reported in #openvswitch that ovn-northd-ddlog can discard the
setting of NB_Global.options:use_logical_dp_groups at startup.  I think
this happens because, at startup, some of the relations in the
Out_NB_Global rule might not be populated yet even though there is
already a row in nb::NB_Global, so that neither rule for Out_NB_Global
matches and therefore ovn-northd-ddlog deletes the row.

This commit changes the structure of how ovn-northd-ddlog generates
Out_NB_Global to ensure that, if there's an input row, it always
generates exactly one output row.  This should be more robust than the
previous version regardless of whether it fixes the exact problem
that Dumitru observed (which I did not try to reproduce).
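
The intended invariant can be sketched in plain C.  This is only an
illustration with hypothetical names (`struct nb_global`,
`choose_out_nb_global`), not the DDlog rules in this patch: prefer the
fully computed record, fall back to a copy of the input row, and produce
nothing when there is no input row.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an NB_Global row. */
struct nb_global {
    long sb_cfg;
    long hv_cfg;
};

/* Picks the single output row.
 *
 * 'input' is the existing NB_Global row, or NULL if there is none.
 * 'computed' is the row derived from the other relations, or NULL if
 * those relations are not populated yet (e.g. at startup).
 *
 * Returns true and stores exactly one row in '*out' whenever 'input' is
 * nonnull; returns false (no output row) otherwise. */
static bool
choose_out_nb_global(const struct nb_global *input,
                     const struct nb_global *computed,
                     struct nb_global *out)
{
    if (!input) {
        return false;
    }
    *out = computed ? *computed : *input;
    return true;
}
```

With this shape there is no state in which an input row exists but no
output row is produced, which is the case that led to the row being
deleted.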

Reported-by: Dumitru Ceara 
Signed-off-by: Ben Pfaff 
---
 northd/ovn_northd.dl | 63 ++--
 1 file changed, 37 insertions(+), 26 deletions(-)

diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl
index 0063021e13f5..d718425b7de3 100644
--- a/northd/ovn_northd.dl
+++ b/northd/ovn_northd.dl
@@ -625,21 +625,25 @@ HvCfgTimestamp(hv_cfg_timestamp) :-
 not HvCfgTimestamp0().
 
 /*
- * NB_Global:
- * - set `sb_cfg` to the value of `SB_Global.nb_cfg`.
- * - set `hv_cfg` to the smallest value of `nb_cfg` across all `Chassis`
- * - FIXME: we use ipsec as unique key to make sure that we don't create multiple `NB_Global`
- *   instance.  There is a potential race condition if this field is modified at the same
- *   time northd is updating `sb_cfg` or `hv_cfg`.
+ * nb::Out_NB_Global.
+ *
+ * OutNBGlobal0 generates the new record in the common case.
+ * OutNBGlobal1 generates the new record as a copy of nb::NB_Global, if sb::SB_Global is missing.
+ * nb::Out_NB_Global makes sure we have only a single record in the relation.
+ *
+ * (We don't generate an NB_Global output record if there isn't
+ * one in the input.  We don't have enough entropy available to
+ * generate a random _uuid.  Doesn't seem like a big deal, because
+ * OVN probably hasn't really been initialized yet.)
  */
-input relation NbCfgTimestamp[integer]
-nb::Out_NB_Global(._uuid = _uuid,
- .sb_cfg= sb_cfg,
- .hv_cfg= hv_cfg,
- .nb_cfg_timestamp = nb_cfg_timestamp,
- .hv_cfg_timestamp = hv_cfg_timestamp,
- .ipsec = ipsec,
- .options   = options) :-
+relation OutNBGlobal0[nb::Out_NB_Global]
+OutNBGlobal0[nb::Out_NB_Global{._uuid = _uuid,
+   .sb_cfg= sb_cfg,
+   .hv_cfg= hv_cfg,
+   .nb_cfg_timestamp = nb_cfg_timestamp,
+   .hv_cfg_timestamp = hv_cfg_timestamp,
+   .ipsec = ipsec,
+   .options   = options}] :-
 NbCfgTimestamp[nb_cfg_timestamp],
 HvCfgTimestamp(hv_cfg_timestamp),
 nbg in nb::NB_Global(._uuid = _uuid, .ipsec = ipsec),
@@ -654,19 +658,26 @@ nb::Out_NB_Global(._uuid = _uuid,
 var options2 = options1.insert_imm("max_tunid", "${max_tunid}"),
 var options = options2.insert_imm("northd_internal_version", ovn_internal_version()).
 
+relation OutNBGlobal1[nb::Out_NB_Global]
+OutNBGlobal1[x] :- OutNBGlobal0[x].
+OutNBGlobal1[nb::Out_NB_Global{._uuid = nbg._uuid,
+   .sb_cfg= nbg.sb_cfg,
+   .hv_cfg= nbg.hv_cfg,
+   .ipsec = nbg.ipsec,
+   .options   = nbg.options,
+   .nb_cfg_timestamp = nbg.nb_cfg_timestamp,
+   .hv_cfg_timestamp = nbg.hv_cfg_timestamp}] :-
+Unit(),
+not OutNBGlobal0[_],
+nbg in nb::NB_Global().
+
+nb::Out_NB_Global[y] :-
+OutNBGlobal1[x],
+var y = x.group_by(()).group_first().
 
-/* SB_Global does not exist yet -- just keep the old value of NB_Global */
-nb::Out_NB_Global(._uuid = nbg._uuid,
- .sb_cfg= nbg.sb_cfg,
- .hv_cfg= nbg.hv_cfg,
- .ipsec = nbg.ipsec,
- .options   = nbg.options,
- .nb_cfg_timestamp = nb_cfg_timestamp,
- .hv_cfg_timestamp = hv_cfg_timestamp) :-
-NbCfgTimestamp[nb_cfg_timestamp],
-HvCfgTimestamp(hv_cfg_timestamp),
-nbg in nb::NB_Global(),
-not sb::SB_Global().
+// Tracks the value that should go into NB_Global's 'nb_cfg_timestamp' column.
+// ovn-northd-ddlog.c pushes the current time directly into this relation.
+input relation NbCfgTimestamp[integer]
 
 output relation SbCfg[integer]
 SbCfg[sb_cfg] :- nb::Out_NB_Global(.sb_cfg = sb_cfg).
-- 
2.29.2



[ovs-dev] [PATCH v2 10/26] ovn-northd-ddlog: Upgrade to ddlog 0.38.

2021-04-01 Thread Ben Pfaff
From: Leonid Ryzhyk 

Upcoming commits will use a new --intern-table option of ovsdb2ddlog,
so we need to upgrade to the version of ddlog that has that feature.

To do so, we need to adapt the code to language changes in ddlog.  This
commit does that for a change in 0.37 in which, when iterating over a
`Group` in a for-loop, the iterator returns `(value, weight)` tuples.
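
As a rough analogy in C (not DDlog; the struct and function names here
are hypothetical), each element of the group is now a pair, and code
that only needs the values has to skip the weight explicitly:

```c
#include <stddef.h>

/* Hypothetical element type: what the iterator yields after the change. */
struct weighted_value {
    int value;
    long weight;
};

/* Sums the values of a group, ignoring weights, the way the updated
 * loop binds "(entry, _)" and uses only 'entry'. */
static int
sum_values(const struct weighted_value *group, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        int entry = group[i].value;  /* group[i].weight is unused. */
        sum += entry;
    }
    return sum;
}
```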

This also adapts ovn-northd-ddlog.c to a slightly updated C API.

Signed-off-by: Leonid Ryzhyk 
Signed-off-by: Ben Pfaff 
---
 NEWS  |  2 +-
 northd/lrouter.dl |  2 +-
 northd/ovn-northd-ddlog.c | 21 +++--
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/NEWS b/NEWS
index a98529ac4ebe..a75c44133b06 100644
--- a/NEWS
+++ b/NEWS
@@ -5,7 +5,7 @@ Post-v21.03.0
 needed for the southbound database when northbound changes occur.  It is
 expected to scale better than the C implementation, for large deployments.
 (This may take testing and tuning to be effective.)  This version of OVN
-requires DDLog 0.36.
+requires DDLog 0.38.
   - Introduce ovn-controller incremetal processing engine statistics
   - Utilities:
 * ovn-nbctl daemon mode is no longer considered experimental.
diff --git a/northd/lrouter.dl b/northd/lrouter.dl
index 36cedd2dc219..e3afff72f41d 100644
--- a/northd/lrouter.dl
+++ b/northd/lrouter.dl
@@ -382,7 +382,7 @@ LogicalRouterSnatIP(lr, snat_ip, Some{nat}) :-
 
 function group_to_setunionmap(g: Group<'K1, ('K2,Set<'V>)>): Map<'K2,Set<'V>> {
 var map = map_empty();
-for (entry in g) {
+for ((entry, _) in g) {
 (var key, var value) = entry;
 match (map.get(key)) {
 None -> map.insert(key, value),
diff --git a/northd/ovn-northd-ddlog.c b/northd/ovn-northd-ddlog.c
index ca1ab325448c..74f0eaccd5bb 100644
--- a/northd/ovn-northd-ddlog.c
+++ b/northd/ovn-northd-ddlog.c
@@ -79,10 +79,11 @@ static table_id WARNING_TABLE_ID;
 static table_id NB_CFG_TIMESTAMP_ID;
 
 /* Initialize frequently used table ids. */
-static void init_table_ids(void)
+static void
+init_table_ids(ddlog_prog ddlog)
 {
-WARNING_TABLE_ID = ddlog_get_table_id("helpers::Warning");
-NB_CFG_TIMESTAMP_ID = ddlog_get_table_id("NbCfgTimestamp");
+WARNING_TABLE_ID = ddlog_get_table_id(ddlog, "helpers::Warning");
+NB_CFG_TIMESTAMP_ID = ddlog_get_table_id(ddlog, "NbCfgTimestamp");
 }
 
 struct northd_ctx {
@@ -347,7 +348,8 @@ ddlog_clear(struct northd_ctx *ctx)
 int n_failures = 0;
 for (int i = 0; ctx->input_relations[i]; i++) {
 char *table = xasprintf("%s%s", ctx->prefix, ctx->input_relations[i]);
-if (ddlog_clear_relation(ctx->ddlog, ddlog_get_table_id(table))) {
+if (ddlog_clear_relation(ctx->ddlog, ddlog_get_table_id(ctx->ddlog,
+table))) {
 n_failures++;
 }
 free(table);
@@ -611,7 +613,7 @@ northd_update_probe_interval(struct northd_ctx *nb, struct northd_ctx *sb)
  * Any other value is an explicit probe interval request from the
  * database. */
 int probe_interval = 0;
-table_id tid = ddlog_get_table_id("Northd_Probe_Interval");
+table_id tid = ddlog_get_table_id(nb->ddlog, "Northd_Probe_Interval");
 ddlog_delta *probe_delta = ddlog_delta_remove_table(nb->delta, tid);
 ddlog_delta_enumerate(probe_delta, northd_update_probe_interval_cb, (uintptr_t) &probe_interval);
 ddlog_free_delta(probe_delta);
@@ -670,7 +672,7 @@ ddlog_table_update_output(struct ds *ds, ddlog_prog ddlog, ddlog_delta *delta,
 return;
 }
 char *table_name = xasprintf("%s::Out_%s", db, table);
-ddlog_delta_clear_table(delta, ddlog_get_table_id(table_name));
+ddlog_delta_clear_table(delta, ddlog_get_table_id(ddlog, table_name));
 free(table_name);
 
 if (!updates[0]) {
@@ -948,7 +950,7 @@ get_database_ops(struct northd_ctx *ctx)
  * We require output-only tables to have an accompanying index
  * named _Index. */
 char *index = xasprintf("%s_Index", table);
-index_id idxid = ddlog_get_index_id(index);
+index_id idxid = ddlog_get_index_id(ctx->ddlog, index);
 if (idxid == -1) {
 VLOG_WARN_RL(&rl, "%s: unknown index", index);
 free(index);
@@ -1000,7 +1002,7 @@ get_database_ops(struct northd_ctx *ctx)
 static int64_t old_sb_cfg_timestamp = INT64_MIN;
 int64_t new_sb_cfg = old_sb_cfg;
 if (ctx->has_timestamp_columns) {
-table_id sb_cfg_tid = ddlog_get_table_id("SbCfg");
+table_id sb_cfg_tid = ddlog_get_table_id(ctx->ddlog, "SbCfg");
 ddlog_delta *sb_cfg_delta = ddlog_delta_remove_table(ctx->delta,
  sb_cfg_tid);
 ddlog_delta_enumerate(sb_cfg_delta, northd_update_sb_cfg_cb,
@@ -1149,8 +1151,6 @@ main(int argc, char *argv[])
 int retval;
 bool exiting;
 
-init_table_ids

[ovs-dev] [PATCH v2 09/26] tests: Miscellaneous debuggability improvements.

2021-04-01 Thread Ben Pfaff
Signed-off-by: Ben Pfaff 
---
 tests/ovn.at | 67 ++--
 1 file changed, 49 insertions(+), 18 deletions(-)

diff --git a/tests/ovn.at b/tests/ovn.at
index 391a8bcd9323..7b6789125ffc 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -9467,9 +9467,10 @@ wait_for_ports_up
 check ovn-nbctl --wait=hv sync
 as hv1 ovs-vsctl show
 
-echo "*"
-ovn-sbctl list DNS
-echo "*"
+ovn-sbctl list DNS > dns
+AT_CAPTURE_FILE([dns])
+ovn-sbctl dump-flows > sbflows
+AT_CAPTURE_FILE([sbflows])
 
 reset_pcap_file() {
 local iface=$1
@@ -9582,7 +9583,13 @@ test_dns() {
 echo $request >> $outport.expected
 done
 fi
-as hv1 ovs-appctl netdev-dummy/receive hv1-vif$inport $request
+if true; then
+as hv1 ovs-appctl ofproto/trace br-int in_port=hv1-vif$inport $request > trace$trace
+trace=$(expr $trace + 1)
+else
+as hv1 ovs-appctl dpctl/del-flows
+as hv1 ovs-appctl netdev-dummy/receive hv1-vif$inport $request
+fi
 }
 
 test_dns6() {
@@ -9614,7 +9621,13 @@ test_dns6() {
 echo $request >> $outport.expected
 done
 fi
-as hv1 ovs-appctl netdev-dummy/receive hv1-vif$inport $request
+if true; then
+as hv1 ovs-appctl ofproto/trace br-int in_port=hv1-vif$inport $request > trace$trace
+trace=$(expr $trace + 1)
+else
+as hv1 ovs-appctl dpctl/del-flows
+as hv1 ovs-appctl netdev-dummy/receive hv1-vif$inport $request
+fi
 }
 
 AT_CAPTURE_FILE([ofctl_monitor0.log])
@@ -9663,8 +9676,7 @@ reset_pcap_file hv1-vif2 hv1/vif2
 rm -f 1.expected
 rm -f 2.expected
 
-# Try vm1 again but an all-caps query name
-
+AS_BOX([Try vm1 again but an all-caps query name])
 set_dns_params VM1
 src_ip=`ip_to_hex 10 0 0 6`
 dst_ip=`ip_to_hex 10 0 0 1`
@@ -9686,8 +9698,12 @@ reset_pcap_file hv1-vif2 hv1/vif2
 rm -f 1.expected
 rm -f 2.expected
 
-# Clear the query name options for ls1-lp2
+AS_BOX([Clear the query name options for ls1-lp2])
 ovn-nbctl --wait=hv remove DNS $DNS1 records vm2.ovn.org
+ovn-sbctl list DNS > dns2
+AT_CAPTURE_FILE([dns2])
+ovn-sbctl dump-flows > sbflows2
+AT_CAPTURE_FILE([sbflows2])
 
 set_dns_params vm2
 src_ip=`ip_to_hex 10 0 0 4`
@@ -9706,10 +9722,14 @@ reset_pcap_file hv1-vif2 hv1/vif2
 rm -f 1.expected
 rm -f 2.expected
 
-# Clear the query name for ls1-lp1
+AS_BOX([Clear the query name for ls1-lp1])
 # Since ls1 has no query names configued,
 # ovn-northd should not add the DNS flows.
 ovn-nbctl --wait=hv remove DNS $DNS1 records vm1.ovn.org
+ovn-sbctl list DNS > dns3
+AT_CAPTURE_FILE([dns3])
+ovn-sbctl dump-flows > sbflows3
+AT_CAPTURE_FILE([sbflows3])
 
 set_dns_params vm1
 src_ip=`ip_to_hex 10 0 0 6`
@@ -9728,9 +9748,13 @@ reset_pcap_file hv1-vif2 hv1/vif2
 rm -f 1.expected
 rm -f 2.expected
 
-# Test IPv6 (AAAA records) using IPv4 packet.
+AS_BOX([Test IPv6 (AAAA records) using IPv4 packet.])
 # Add back the DNS options for ls1-lp1.
 ovn-nbctl --wait=hv set DNS $DNS1 records:vm1.ovn.org="10.0.0.4 aef0::4"
+ovn-sbctl list DNS > dns4
+AT_CAPTURE_FILE([dns4])
+ovn-sbctl dump-flows > sbflows4
+AT_CAPTURE_FILE([sbflows4])
 
 set_dns_params vm1_ipv6_only
 src_ip=`ip_to_hex 10 0 0 6`
@@ -9753,7 +9777,7 @@ reset_pcap_file hv1-vif2 hv1/vif2
 rm -f 1.expected
 rm -f 2.expected
 
-# Test both IPv4 (A) and IPv6 (AAAA records) using IPv4 packet.
+AS_BOX([Test both IPv4 (A) and IPv6 (AAAA records) using IPv4 packet.])
 set_dns_params vm1_ipv4_v6
 src_ip=`ip_to_hex 10 0 0 6`
 dst_ip=`ip_to_hex 10 0 0 1`
@@ -9775,7 +9799,7 @@ reset_pcap_file hv1-vif2 hv1/vif2
 rm -f 1.expected
 rm -f 2.expected
 
-# Invalid type.
+AS_BOX([Invalid type])
 set_dns_params vm1_invalid_type
 src_ip=`ip_to_hex 10 0 0 6`
 dst_ip=`ip_to_hex 10 0 0 1`
@@ -9793,7 +9817,7 @@ reset_pcap_file hv1-vif2 hv1/vif2
 rm -f 1.expected
 rm -f 2.expected
 
-# Incomplete DNS packet.
+AS_BOX([Incomplete DNS packet])
 set_dns_params vm1_incomplete
 src_ip=`ip_to_hex 10 0 0 6`
 dst_ip=`ip_to_hex 10 0 0 1`
@@ -9811,8 +9835,12 @@ reset_pcap_file hv1-vif2 hv1/vif2
 rm -f 1.expected
 rm -f 2.expected
 
-# Add one more DNS record to the ls1.
+AS_BOX([Add one more DNS record to the ls1])
 ovn-nbctl --wait=hv set Logical_switch ls1 dns_records="$DNS1 $DNS2"
+ovn-sbctl list DNS > dns5
+AT_CAPTURE_FILE([dns5])
+ovn-sbctl dump-flows > sbflows5
+AT_CAPTURE_FILE([sbflows5])
 
 set_dns_params vm3
 src_ip=`ip_to_hex 10 0 0 4`
@@ -9835,7 +9863,7 @@ reset_pcap_file hv1-vif2 hv1/vif2
 rm -f 1.expected
 rm -f 2.expected
 
-# Try DNS query over IPv6
+AS_BOX([Try DNS query over IPv6])
 set_dns_params vm1
 src_ip=aef4
 dst_ip=aef1
@@ -10953,10 +10981,10 @@ check ovn-nbctl --wait=hv sync
 
 # Check that there is a logical flow in logical switch foo's pipeline
 # to set the outport to rp-foo with the condition is_chassis_redirect.
-ovn-sbctl dump-flows foo > sbflows
+OVS_WAIT_UNTIL([ovn-sbctl dump-flows foo > sbflows
+t

[ovs-dev] [PATCH v2 07/26] ovn-dbctl: Fix memory leak in client mode.

2021-04-01 Thread Ben Pfaff
This leak isn't significant, since this commit frees the memory just
before exiting, but fixing it cleans up the Address Sanitizer report.
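
The cleanup follows the usual pattern for a heap-allocated argument
vector whose elements are also heap-allocated.  A minimal sketch of that
pattern, with hypothetical helper names (`copy_args` and `free_args` are
not functions in ovn-dbctl.c):

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated copy of the 'argc' strings in 'argv', or NULL
 * if allocation fails.  Every element and the vector itself are owned by
 * the caller. */
static char **
copy_args(int argc, const char *const argv[])
{
    char **copy = calloc(argc > 0 ? argc : 1, sizeof *copy);
    if (!copy) {
        return NULL;
    }
    for (int i = 0; i < argc; i++) {
        size_t len = strlen(argv[i]) + 1;
        copy[i] = malloc(len);
        if (!copy[i]) {
            /* Undo the partial copy before reporting failure. */
            while (i-- > 0) {
                free(copy[i]);
            }
            free(copy);
            return NULL;
        }
        memcpy(copy[i], argv[i], len);
    }
    return copy;
}

/* Frees each element and then the vector, in the same order as the
 * loop added by this patch. */
static void
free_args(int argc, char **argv)
{
    for (int i = 0; i < argc; i++) {
        free(argv[i]);
    }
    free(argv);
}
```

Freeing only the vector would still leak every element, which is what a
leak checker would report.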

Signed-off-by: Ben Pfaff 
---
 utilities/ovn-dbctl.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/utilities/ovn-dbctl.c b/utilities/ovn-dbctl.c
index 28ebc6267066..d815dc5c8c5f 100644
--- a/utilities/ovn-dbctl.c
+++ b/utilities/ovn-dbctl.c
@@ -1210,5 +1210,9 @@ dbctl_client(const struct ovn_dbctl_options *dbctl_options,
 free(cmd_result);
 free(cmd_error);
 jsonrpc_close(client);
+for (int i = 0; i < argc; i++) {
+free(argv[i]);
+}
+free(argv);
 exit(exit_status);
 }
-- 
2.29.2



[ovs-dev] [PATCH v2 03/26] ovn-nbctl: Improve manpage.

2021-04-01 Thread Ben Pfaff
This rearranges the manpage into a more logical order, documents some
options that weren't documented, adds some sections such as
Environment and Exit Status that a manpage should have, puts the
headings at reasonable levels instead of all at the top level, and adds
a little more explanatory text in a few places.

Signed-off-by: Ben Pfaff 
---
 utilities/ovn-nbctl.8.xml | 670 ++
 1 file changed, 392 insertions(+), 278 deletions(-)

diff --git a/utilities/ovn-nbctl.8.xml b/utilities/ovn-nbctl.8.xml
index 03d47dba5b72..39f9381fdaae 100644
--- a/utilities/ovn-nbctl.8.xml
+++ b/utilities/ovn-nbctl.8.xml
@@ -7,9 +7,327 @@
 ovn-nbctl [options] command [arg...]
 
 Description
-This utility can be used to manage the OVN northbound database.
 
-General Commands
+
+  The ovn-nbctl program configures the
+  OVN_Northbound database by providing a high-level interface
+  to its configuration database.  See ovn-nb(5) for
+  comprehensive documentation of the database schema.
+
+
+
+  ovn-nbctl connects to an ovsdb-server process
+  that maintains an OVN_Northbound configuration database.  Using this
+  connection, it queries and possibly applies changes to the database,
+  depending on the supplied commands.
+
+
+
+  ovn-nbctl can perform any number of commands in a single
+  run, implemented as a single atomic transaction against the database.
+
+
+
+  The ovn-nbctl command line begins with global options (see
+  OPTIONS below for details).  The global options are followed
+  by one or more commands.  Each command should begin with --
+  by itself as a command-line argument, to separate it from the following
+  commands.  (The -- before the first command is optional.)
+  The command itself starts with command-specific options, if any, followed
+  by the command name and any arguments.
+
+
+Daemon Mode
+
+
+  When it is invoked in the most ordinary way, ovn-nbctl
+  connects to an OVSDB server that hosts the northbound database, retrieves
+  a partial copy of the database that is complete enough to do its work,
+  sends a transaction request to the server, and receives and processes the
+  server's reply.  In common interactive use, this is fine, but if the
+  database is large, the step in which ovn-nbctl retrieves a
+  partial copy of the database can take a long time, which yields poor
+  performance overall.
+
+
+
+  To improve performance in such a case, ovn-nbctl offers a
+  "daemon mode," in which the user first starts ovn-nbctl
+  running in the background and afterward uses the daemon to execute
+  operations.  Over several ovn-nbctl command invocations,
+  this performs better overall because it retrieves a copy of the database
+  only once at the beginning, not once per program run.
+
+
+
+  Use the --detach option to start an ovn-nbctl
+  daemon.  With this option, ovn-nbctl prints the name of a
+  control socket to stdout.  The client should save this name in
+  environment variable OVN_NB_DAEMON.  Under the Bourne shell
+  this might be done like this:
+
+
+
+  export OVN_NB_DAEMON=$(ovn-nbctl --pidfile --detach)
+
+
+
+  When OVN_NB_DAEMON is set, ovn-nbctl
+  automatically and transparently uses the daemon to execute its commands.
+
+
+
+  When the daemon is no longer needed, kill it and unset the environment
+  variable, e.g.:
+
+
+
+  kill $(cat $OVN_RUNDIR/ovn-nbctl.pid)
+  unset OVN_NB_DAEMON
+
+
+
+  When using daemon mode, an alternative to the OVN_NB_DAEMON
+  environment variable is to specify a path for the Unix socket. When
+  starting the ovn-nbctl daemon, specify the -u option with a
+  full path to the location of the socket file. Here is an example:
+
+
+
+  ovn-nbctl --detach -u /tmp/mysock.ctl
+
+
+
+  Then to connect to the running daemon, use the -u option
+  with the full path to the socket created when the daemon was started:
+
+
+
+  ovn-nbctl -u /tmp/mysock.ctl show
+
+
+
+  Daemon mode is experimental.
+
+
+Daemon Commands
+
+
+  Daemon mode is internally implemented using the same mechanism used by
+  ovs-appctl.  One may also use ovs-appctl
+  directly with the following commands:
+
+
+
+  
+run [options] command
+[arg...] [-- [options]
+command [arg...] ...]
+  
+  
+Instructs the daemon process to run one or more ovn-nbctl
+commands described above and reply with the results of running these
+commands. Accepts the --no-wait, --wait,
+--timeout, --dry-run, --oneline,
+and the options described under Table Formatting Options
+in addition to the command-specific options.
+  
+
+

[ovs-dev] [PATCH v2 05/26] ovn-nbctl: Daemon mode is no longer experimental.

2021-04-01 Thread Ben Pfaff
Mark says that it was heavily used by ovn-kubernetes for years without
any issues.

Signed-off-by: Ben Pfaff 
Suggested-by: Mark Michelson 
---
 NEWS  | 1 +
 utilities/ovn-nbctl.8.xml | 4 
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index 530c5d42fe85..8b170bcba6fb 100644
--- a/NEWS
+++ b/NEWS
@@ -7,6 +7,7 @@ Post-v21.03.0
 (This may take testing and tuning to be effective.)  This version of OVN
 requires DDLog 0.36.
   - Introduce ovn-controller incremetal processing engine statistics
+  - ovn-nbctl daemon mode is no longer considered experimental.
 
 OVN v21.03.0 - 12 Mar 2021
 -
diff --git a/utilities/ovn-nbctl.8.xml b/utilities/ovn-nbctl.8.xml
index 73a0ee25fceb..afe3874e6d03 100644
--- a/utilities/ovn-nbctl.8.xml
+++ b/utilities/ovn-nbctl.8.xml
@@ -106,10 +106,6 @@
   ovn-nbctl -u /tmp/mysock.ctl show
 
 
-
-  Daemon mode is experimental.
-
-
 Daemon Commands
 
 
-- 
2.29.2



[ovs-dev] [PATCH v2 00/26] ddlog 5x performance improvement

2021-04-01 Thread Ben Pfaff
This series of patches greatly improves the performance of
ovn-northd-ddlog with the benchmark added in the final patch.  The
first three patches improve benchmark performance for both versions of
ovn-northd.

Here are the timings that I measure in each case.  All of them
include the benefit of the first three patches.  Without those
patches, the C version takes over 500 seconds and the others take much
longer too; the relative timings aren't affected much, everything is
just slower:

C: 106.8s (0.135s ... 1.043s)
ddlog before optimization patches: 176.0s (0.128s ... 1.804s)
ddlog after optimization patches:   35.2s (0.129s ... 0.147s)
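
For reference, the "5x" in the subject is the ratio of the two ddlog
totals above.  A trivial sketch of the arithmetic (the helper name
`speedup` is made up for this note):

```c
/* Returns how many times faster 'after_s' is than 'before_s'.  Both are
 * total run times in seconds, and 'after_s' must be positive. */
static double
speedup(double before_s, double after_s)
{
    return before_s / after_s;
}
```

With the totals above, speedup(176.0, 35.2) is 5.0 for ddlog before
versus after the optimization patches, and speedup(106.8, 35.2) is
about 3.03 for the C version versus optimized ddlog.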

v1->v2:
  - Don't remove --output-only-table option from ovsdb2ddlog2c
in "ovn-northd-ddlog: Intern selected input relations.".
  - New patches "ovn-nbctl: Daemon mode is no longer experimental."
and "ovn-nbctl: Recommend ovn-appctl instead of ovs-appctl."
and make similar changes to new ovn-sbctl manpage.
  - Update ovn-sbctl and ovn-nbctl manpages to reference ovn-appctl
manpage.
  - Various trivial changes suggested by checkpatch.
  - New patches "ovn-nbctl: Fix memory leak in client mode."
and "ovn-northd-ddlog: Fix two memory leaks." fix memory leaks
reported by Numan and found by Address Sanitizer.
  - Fix bug introduced into ovsdb2ddlog2c in "ovn-northd-ddlog: Intern
selected input relations."

Ben Pfaff (11):
  ovn-northd-ddlog: Fix two memory leaks.
  ovn-nbctl: Fix memory leak in client mode.
  ovn-nbctl: Improve manpage.
  ovn-nbctl: Recommend ovn-appctl instead of ovs-appctl.
  ovn-nbctl: Daemon mode is no longer experimental.
  ovn-nbctl: Refactor into infrastructure and northbound details.
  ovn-dbctl: Fix memory leak in client mode.
  ovn-sbctl: Add daemon support.
  tests: Miscellaneous debuggability improvements.
  ovn-northd-ddlog: Preserve NB_Global more carefully.
  tutorial: Add benchmarking test script to run within sandbox.

Leonid Ryzhyk (15):
  ovn-northd-ddlog: Upgrade to ddlog 0.38.
  ovn-northd-ddlog: Remove `lr` field from `Router`.
  ovn-northd-ddlog: Intern the `Router` table.
  ovn-northd-ddlog: Workaround for slow group_by.
  ovn-northd-ddlog: Intern the Switch table.
  ovn-northd-ddlog: Remove `ls` field from `Switch`.
  ovn-northd-ddlog: Intern the SwitchPort table.
  ovn-northd-ddlog: Intern the RouterPort table.
  ovn-northd-ddlog: Remove unused function.
  ovn-northd-ddlog: Eliminate remaining Ref's.
  ovn-northd-ddlog: Eliminate redundant dereferences.
  ovn-northd-ddlog: Intern selected input relations.
  ovn-northd-ddlog: Intern nb::Logical_Router_Port.
  ovn-northd-ddlog: Intern nb::Logical_Switch_Port.
  ovn-northd-ddlog: Remove Router.static_routes.

 NEWS  |5 +-
 manpages.mk   |   17 -
 northd/helpers.dl |   40 +-
 northd/ipam.dl|   61 +-
 northd/lrouter.dl |  188 +++--
 northd/lswitch.dl |  243 +++---
 northd/multicast.dl   |   77 +-
 northd/ovn-nb.dlopts  |   10 +
 northd/ovn-northd-ddlog.c |   23 +-
 northd/ovn-sb.dlopts  |1 +
 northd/ovn_northd.dl  | 1065 +-
 northd/ovsdb2ddlog2c  |4 +-
 tests/ovn-sbctl.at|   76 +-
 tests/ovn.at  |   67 +-
 tutorial/automake.mk  |3 +-
 tutorial/northd_ddlog_test.sh |   81 ++
 utilities/automake.mk |   12 +-
 utilities/ovn-dbctl.c | 1230 +
 utilities/ovn-dbctl.h |   61 ++
 utilities/ovn-nbctl.8.xml |  667 +---
 utilities/ovn-nbctl.c | 1363 -
 utilities/ovn-sbctl.8.in  |  317 
 utilities/ovn-sbctl.8.xml |  580 ++
 utilities/ovn-sbctl.c |  670 +++-
 24 files changed, 3599 insertions(+), 3262 deletions(-)
 create mode 100755 tutorial/northd_ddlog_test.sh
 create mode 100644 utilities/ovn-dbctl.c
 create mode 100644 utilities/ovn-dbctl.h
 delete mode 100644 utilities/ovn-sbctl.8.in
 create mode 100644 utilities/ovn-sbctl.8.xml

-- 
2.29.2



[ovs-dev] [PATCH v2 04/26] ovn-nbctl: Recommend ovn-appctl instead of ovs-appctl.

2021-04-01 Thread Ben Pfaff
They do exactly the same thing but ovn-appctl has more ovn in its name.

Signed-off-by: Ben Pfaff 
Suggested-by: Mark Michelson 
---
 utilities/ovn-nbctl.8.xml | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/utilities/ovn-nbctl.8.xml b/utilities/ovn-nbctl.8.xml
index 39f9381fdaae..73a0ee25fceb 100644
--- a/utilities/ovn-nbctl.8.xml
+++ b/utilities/ovn-nbctl.8.xml
@@ -114,7 +114,7 @@
 
 
   Daemon mode is internally implemented using the same mechanism used by
-  ovs-appctl.  One may also use ovs-appctl
+  ovn-appctl.  One may also use ovn-appctl
   directly with the following commands:
 
 
@@ -1584,6 +1584,7 @@
 
 
 See Also
-ovn-nb(5).
+ovn-nb(5),
+ovn-appctl(8).
 
 
-- 
2.29.2



[ovs-dev] [PATCH v2 02/26] ovn-nbctl: Fix memory leak in client mode.

2021-04-01 Thread Ben Pfaff
The leak isn't notable, since the process exits right after the point
where this commit frees the memory, but the change cleans up the Address
Sanitizer report.

Signed-off-by: Ben Pfaff 
---
 utilities/ovn-nbctl.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/utilities/ovn-nbctl.c b/utilities/ovn-nbctl.c
index 18405835699d..38a4cc7dbd57 100644
--- a/utilities/ovn-nbctl.c
+++ b/utilities/ovn-nbctl.c
@@ -7096,5 +7096,9 @@ nbctl_client(const char *socket_name,
 free(cmd_result);
 free(cmd_error);
 jsonrpc_close(client);
+for (int i = 0; i < argc; i++) {
+free(argv[i]);
+}
+free(argv);
 exit(exit_status);
 }
-- 
2.29.2



[ovs-dev] [PATCH v2 01/26] ovn-northd-ddlog: Fix two memory leaks.

2021-04-01 Thread Ben Pfaff
I get a clean report from Address Sanitizer now.

Signed-off-by: Ben Pfaff 
---
 northd/ovn-northd-ddlog.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/northd/ovn-northd-ddlog.c b/northd/ovn-northd-ddlog.c
index b7d2c8a5ef8d..ca1ab325448c 100644
--- a/northd/ovn-northd-ddlog.c
+++ b/northd/ovn-northd-ddlog.c
@@ -336,6 +336,7 @@ ddlog_commit(ddlog_prog ddlog, ddlog_delta *delta)
 
 /* Merge changes into `delta`. */
 ddlog_delta_union(delta, new_delta);
+ddlog_free_delta(new_delta);
 
 return 0;
 }
@@ -1213,6 +1214,7 @@ main(int argc, char *argv[])
 
 char *ovn_internal_version = ovn_get_internal_version();
 VLOG_INFO("OVN internal version is : [%s]", ovn_internal_version);
+free(ovn_internal_version);
 
 daemonize_complete();
 
-- 
2.29.2



Re: [ovs-dev] [PATCH RESEND ovs v3 4/4] dpif: Don't set "burst_size" to "rate" if not specified.

2021-04-01 Thread Ilya Maximets
On 3/31/21 12:29 AM, Jean Tourrilhes wrote:
> On Tue, Mar 30, 2021 at 02:27:11PM -0700, Ben Pfaff wrote:
>> On Tue, Mar 30, 2021 at 11:16:48PM +0200, Ilya Maximets wrote:
>>>
>>> OpenFlow spec is a bit loose in definition of what should
>>> be behavior if burst is not set:
>>> """
>>> If the flag OFPMF_BURST is not set the burst_size values from meter
>>> bands are ignored, and if the meter implementation uses a burst value,
>>> this burst value must be set to an implementation defined optimal value.
>>> """
>>>
>>> In our case, historically, "implementation defined optimal value" was
>>> value equal to rate.  I have no idea why, but it's hard to argue with
>>> it since the spec gives a great freedom to choose.
>>>
>>> Actually, the "burst" itself as a term makes very little sense to me.
>>> It's defined by the spec as:
>>> """
>>> It defines the granularity of the meter band, for all packet or byte
>>> bursts whose length is greater than burst value, the meter rate will
>>> always be strictly enforced.
>>> """
>>>
>>> But what is the burst?  How the implementation should define which
>>> packets are in the burst and which are from the next one?
>>>
>>> Current implementation just assumes that bursts are measured per second.
>>> But the rate is measured per second too.  So, burst and rate is
>>> essentially the same thing and implementations just sums them together
>>> to get the bucket size.  So, I do not understand why "burst" and
>>> "burst_size" exist at all.  Why not just set the rate a bit higher?
>>>
>>> Ben, can you shed some light on this?  What was the original idea
>>> behind the meter burst?  Or maybe I'm missing something?
> 
>   I don't understand how you can confuse a rate and a size. The
> OpenFlow spec clearly says it's in kilobits or packets (not per
> seconds).
>   A basic token bucket has only two parameters, the committed
> rate and the burst size (i.e. maximum number of tokens in the
> bucket). The spec reflects that in a generic way to avoid mandating an
> implementation.

Thanks, Jean.

My problem, actually, was that I started from the implementation in the
kernel datapath and it looked super weird.  Especially, this part:
  https://elixir.bootlin.com/linux/latest/source/net/openvswitch/meter.c#L644
Then I tried to find the truth inside the spec, but it doesn't define
the implementation on purpose.  So the only option I had was to guess
how this is supposed to work.  Whether it was 11pm or something else, my
guessing engine didn't come up with anything that might make sense. :)

>   Burst rate is only defined for more fancy rate limiters, such
> as two colors rate limiters. In this case, you also have two burst
> size, one for each token bucket. The OpenFlow spec does not support
> those extra parameters (as of version 1.5.1).
>   For Linux 'police' filter : rate == rate ; burst_size == burst
>   For Linux 'htb' qdisc : rate == rate ; burst_size == burst ;
> ceil and cburst are not supported.

This totally makes sense.  OTOH, the implementation inside both datapaths
doesn't.

Thanks for pushing me in the right direction.

For the implementations: I think they need to be reworked.
At least, we need to get rid of 'rate' in the calculation of the maximum
bucket size.  It should not depend on rate, only on the burst size.
i.e. instead of:
i.e. instead of:
  max_bucket_size = (band->burst_size + band->rate) * 1000;
there should be:
  max_bucket_size = band->burst_size * 1000;

This way the implementations will make at least a bit of sense.

Summing burst size with rate is like summing apples with oranges.
And that is what misled me.

About having a value for a burst size being numerically equal to the
configured rate:  this looks like some kind of estimated value, but
it's really hard to argue with it, because research is needed to
define the "good value for most cases".

Anyway, we need to fix calculation of a maximum bucket size first.
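
To make the point concrete, here is a minimal sketch of a single-rate
token bucket in which the capacity depends only on the burst size and the
rate only controls refill.  It is not the code from either datapath: the
struct, the function names and the units (tokens per millisecond) are
assumptions made up for this illustration, and the "* 1000" scaling from
the lines above is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical token bucket; names and units are illustrative only. */
struct token_bucket {
    uint64_t rate;        /* Tokens added per millisecond. */
    uint64_t burst_size;  /* Capacity: maximum tokens the bucket can hold. */
    uint64_t tokens;      /* Tokens currently available. */
    uint64_t last_ms;     /* Time of the last refill, in milliseconds. */
};

static void
token_bucket_init(struct token_bucket *tb, uint64_t rate,
                  uint64_t burst_size, uint64_t now_ms)
{
    tb->rate = rate;
    tb->burst_size = burst_size;
    tb->tokens = burst_size;    /* Start with a full bucket. */
    tb->last_ms = now_ms;
}

/* Refills the bucket for the time elapsed since the last call, then tries
 * to take 'n' tokens.  Returns true if the packet conforms. */
static bool
token_bucket_withdraw(struct token_bucket *tb, uint64_t n, uint64_t now_ms)
{
    if (now_ms > tb->last_ms) {
        uint64_t elapsed = now_ms - tb->last_ms;
        uint64_t room = tb->burst_size - tb->tokens;

        /* The cap is 'burst_size' alone; 'rate' is never added to it.
         * Comparing against room / rate avoids overflow in the multiply. */
        if (tb->rate && elapsed > room / tb->rate) {
            tb->tokens = tb->burst_size;
        } else {
            tb->tokens += elapsed * tb->rate;
        }
        tb->last_ms = now_ms;
    }

    if (tb->tokens >= n) {
        tb->tokens -= n;
        return true;
    }
    return false;
}
```

With rate 1 and burst_size 10, a long idle period still leaves at most 10
tokens in the bucket, so a request for 11 tokens never conforms no matter
how high the rate is.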

Best regards, Ilya Maximets.


Re: [ovs-dev] [PATCH RESEND ovs v3 4/4] dpif: Don't set "burst_size" to "rate" if not specified.

2021-04-01 Thread Jean Tourrilhes
On Thu, Apr 01, 2021 at 07:53:25PM +0800, Tonghao Zhang wrote:
> 
> Hi Ben, Ilya
> Try to explain this patch again. Now OvS has supported the burst_size,
>  as one user case,
> if users don't use the burst_size feature, we should set burst_size to
> rate or 0. This patch set this to 0.

'0' is definitely not an "implementation defined optimal
value" as the spec requires. Actually, most token bucket
implementations do not work or work very poorly with a bucket size of
zero.
The first hit on a Google search :
https://www.juniper.net/documentation/us/en/software/junos/routing-policy/topics/concept/policer-mx-m120-m320-burstsize-determining.html
I don't fully agree with their recommendations, 5ms is way too
large for high speed networks, but they expose the problem properly.

> As Ilya said, we should check the OFPMF13_BURST in userspace datapath,
> I think it's right.

I don't have enough context to comment on the patch and
I don't know where that flag should be tested. It would seem that all
datapaths will suffer from the same issue, you need to figure out an
optimal burst_size for user space *and* for kernel space if none is
given via OpenFlow.
I have a strong suspicion that the patch does not do a great
job if OFPMF13_BURST is not set, but I may be wrong.
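
One way to picture a non-zero default is to derive the burst size from the
rate and a time window, with a floor so that it never collapses to zero.
The sketch below is only an illustration of that idea: the function name,
the parameters and the choice of window and floor are all assumptions, not
values taken from the spec or from either datapath.

```c
#include <stdint.h>

/* Hypothetical helper: returns a burst size in kilobits for a meter band
 * configured with 'rate_kbps' (kilobits per second).  'interval_us' is the
 * window, in microseconds, over which a burst at line rate is tolerated,
 * and 'floor_kbits' is the smallest bucket the caller will accept.
 * The multiply can overflow for very large inputs; a real implementation
 * would have to guard against that. */
static uint64_t
default_burst_kbits(uint64_t rate_kbps, uint64_t interval_us,
                    uint64_t floor_kbits)
{
    uint64_t burst = rate_kbps * interval_us / 1000000;

    return burst > floor_kbits ? burst : floor_kbits;
}
```

For example, a 1 Gbps rate (1000000 kbps) with a 5 ms window gives 5000
kilobits, while a 1 Mbps rate with the same window would give only 5
kilobits and falls back to the floor.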

Regards,

Jean


Re: [ovs-dev] [PATCH v4] netlink: ignore IFLA_WIRELESS events

2021-04-01 Thread Ilya Maximets
On 3/4/21 4:32 PM, Michal Kazior wrote:
> From: Michal Kazior 
> 
> Some older wireless drivers - ones relying on the
> old and long deprecated wireless extension ioctl
> system - can generate quite a bit of IFLA_WIRELESS
> events depending on their configuration and
> runtime conditions. These are delivered as
> RTNLGRP_LINK via RTM_NEWLINK messages.
> 
> These tend to be relatively easily identifiable
> because they report the change mask being 0. This
> isn't guaranteed but in practice it shouldn't be a
> problem. None of the wireless events that I ever
> observed actually carry any unique information
> about netdev states that ovs-vswitchd is
> interested in. Hence ignoring these shouldn't
> cause any problems.
> 
> These events can be responsible for a significant
> CPU churn as ovs-vswitchd attempts to do plenty of
> work for each and every netlink message regardless
> of what that message carries. On low-end devices
> such as consumer-grade routers these can lead to a
> lot of CPU cycles being wasted, adding up to heat
> output and reducing performance.
> 
> It could be argued that wireless drivers in
> question should be fixed, but that isn't exactly a
> trivial thing to do. Patching ovs seems far more
> viable while still making sense.
> 
> Signed-off-by: Michal Kazior 
> ---
> 
> Notes:
> v4:
>  - fixed comment-too-long checkpatch warning [0day robot]
> 
> v3:
>  - don't change rtnetlink_parse() semantics, instead
>extend rtnetlink_change struct and update its
>consumers [Ilya]
>  - adjusted commit log to reflect different approach
>[Ilya]
> 
> v2:
>  - fix bracing style [0day robot / checkpatch]
> 
>  lib/if-notifier.c  |  7 ++-
>  lib/netdev-linux.c |  9 +
>  lib/route-table.c  |  4 
>  lib/rtnetlink.c| 18 ++
>  lib/rtnetlink.h|  3 +++
>  5 files changed, 40 insertions(+), 1 deletion(-)

Hi.  Thanks for the new version!
And sorry again for slow reviews.

> diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
> index 6be23dbee..388288f71 100644
> --- a/lib/netdev-linux.c
> +++ b/lib/netdev-linux.c
> @@ -663,6 +663,10 @@ netdev_linux_update_lag(struct rtnetlink_change *change)
>  {
>  struct linux_lag_member *lag;
>  
> +if (change->irrelevant) {
> +return;
> +}
> +
>  if (change->sub && netdev_linux_kind_is_lag(change->sub)) {
>  lag = shash_find_data(&lag_shash, change->ifname);
>  
> @@ -887,6 +891,10 @@ netdev_linux_update(struct netdev_linux *dev, int nsid,
>  const struct rtnetlink_change *change)
>  OVS_REQUIRES(dev->mutex)
>  {
> +if (change->irrelevant) {
> +return;
> +}
> +
>  if (netdev_linux_netnsid_is_eq(dev, nsid)) {
>  netdev_linux_update__(dev, change);
>  }

It's unclear why we need to check inside these functions.
I mean, there is only one place where these functions are called
and no useful work is done there besides calling them.
I think it's better to just check right after receiving
the change, in the same way as in netdev_linux_update_via_netlink().

Something like this:

diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index e9ce41d10..ef90fc44c 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -663,10 +663,6 @@ netdev_linux_update_lag(struct rtnetlink_change *change)
 {
 struct linux_lag_member *lag;
 
-if (change->irrelevant) {
-return;
-}
-
 if (change->sub && netdev_linux_kind_is_lag(change->sub)) {
 lag = shash_find_data(&lag_shash, change->ifname);
 
@@ -746,7 +742,7 @@ netdev_linux_run(const struct netdev_class *netdev_class OVS_UNUSED)
 if (!error) {
 struct rtnetlink_change change;
 
-if (rtnetlink_parse(&buf, &change)) {
+if (rtnetlink_parse(&buf, &change) && !change.irrelevant) {
 struct netdev *netdev_ = NULL;
 char dev_name[IFNAMSIZ];
 
@@ -891,10 +887,6 @@ netdev_linux_update(struct netdev_linux *dev, int nsid,
 const struct rtnetlink_change *change)
 OVS_REQUIRES(dev->mutex)
 {
-if (change->irrelevant) {
-return;
-}
-
 if (netdev_linux_netnsid_is_eq(dev, nsid)) {
 netdev_linux_update__(dev, change);
 }
---

What do you think?
If it looks good to you, I can squash above diff with your patch and
apply to master.

Best regards, Ilya Maximets.


Re: [ovs-dev] [PATCH v3 0/5] ipsec: Fix IPv6 support

2021-04-01 Thread Ilya Maximets
On 4/1/21 8:48 PM, Mark Gray wrote:
> On 01/04/2021 19:29, Ilya Maximets wrote:
>> On 4/1/21 3:58 PM, Mark Gray wrote:
>>> This series fixes IPv6 support for Libreswan and introduces
>>> IPsec system tests for Libreswan.
>>>
>>> Mark Gray (5):
>>>   ipsec: IPv6 default route support for Libreswan
>>>   system-common-macros: clean up veth device on test failure
>>>   ipsec: Allow custom file locations
>>>   ipsec: Introduce IPsec system tests for Libreswan
>>>   ipsec: Update ordering of imports
>>>
>>>  ipsec/ovs-monitor-ipsec.in | 144 ++--
>>>  tests/automake.mk  |   3 +-
>>>  tests/system-common-macros.at  |   2 +-
>>>  tests/system-ipsec.at  | 406 +
>>>  tests/system-kmod-testsuite.at |   1 +
>>>  5 files changed, 529 insertions(+), 27 deletions(-)
>>>  create mode 100644 tests/system-ipsec.at
>>>
>>
>> Thanks!  Applied to master.
>> First patch with the fix also backported down to 2.13.
>>
>> Best regards, Ilya Maximets.
>>
> 
> Thanks Ilya.
> 
> Flavio was having some issues with the system test patch which appears
> to have a flake. We have not received his Ack yet. I will follow-up with
> a fix if it doesn't work for him or you can revert.
> 

I didn't notice any issues while running these tests on
my system.

Anyway, the system testsuite has other flaky and even broken
tests, so it's not a big issue.  I'd like to accept the
fix when it's ready, though.

Best regards, Ilya Maximets.


Re: [ovs-dev] [OVN Patch v16 2/3] ovn-northd: Introduce parallel lflow build

2021-04-01 Thread Numan Siddique
On Tue, Mar 30, 2021 at 2:47 PM  wrote:
>
> From: Anton Ivanov 
>
> Datapaths, ports, igmp groups and load balancers can now
> be iterated over in parallel in order to speed up the lflow
> generation. This decreases the time needed to generate the
> logical flows by a factor of 4+ on a 6 core/12 thread CPU
> without datapath groups - from 0.8-1 microseconds per flow
> down to 0.2-0.3 microseconds per flow on average.
>
> The decrease in time to compute lflows with datapath groups
> enabled is ~2 times for the same hardware - from an average of
> 2.4 microseconds per flow to 1.2 microseconds per flow.
>
> Tested on an 8 node, 400 pod K8 simulation resulting
> in > 6K flows.
>
> Signed-off-by: Anton Ivanov 

Hi Anton,

I tested on my setup by applying the first 2 patches of this series.  I
don't see any crashes now, and all the tests pass.  Great!

However, compilation is failing with clang:

***
../northd/ovn-northd.c:7336:25: error: incompatible integer to pointer
conversion passing 'uint64_t' (aka 'unsigned long') to parameter of
type 'uint64_t *' (aka 'unsigned long *') [-Werror,-Wint-conversion]
(uint64_t) &mcast_sw_info->table_size,
^
/home/nusiddiq/workspace_cpp/ovn-org/ovn-for-reviews/ovn/ovs/lib/ovs-atomic-clang.h:57:50:
note: expanded from macro 'atomic_compare_exchange_strong'
atomic_compare_exchange_strong_explicit(DST, EXP, SRC,  \
 ^~~
/home/nusiddiq/workspace_cpp/ovn-org/ovn-for-reviews/ovn/ovs/lib/ovs-atomic-clang.h:61:47:
note: expanded from macro 'atomic_compare_exchange_strong_explicit'
__c11_atomic_compare_exchange_strong(DST, EXP, SRC, ORD1, ORD2)
  ^~~
../northd/ovn-northd.c:7352:25: error: passing 'int64_t *' (aka 'long
*') to parameter of type 'uint64_t *' (aka 'unsigned long *') converts
between pointers to integer types with different sign
[-Werror,-Wpointer-sign]
&mcast_sw_info->table_size,
^~
/home/nusiddiq/workspace_cpp/ovn-org/ovn-for-reviews/ovn/ovs/lib/ovs-atomic-clang.h:57:50:
note: expanded from macro 'atomic_compare_exchange_strong'
atomic_compare_exchange_strong_explicit(DST, EXP, SRC,  \
 ^~~
/home/nusiddiq/workspace_cpp/ovn-org/ovn-for-reviews/ovn/ovs/lib/ovs-atomic-clang.h:61:47:
note: expanded from macro 'atomic_compare_exchange_strong_explicit'
__c11_atomic_compare_exchange_strong(DST, EXP, SRC, ORD1, ORD2)

**

I fixed it manually by casting to (uint64_t *) and all the tests passed for me.
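
For reference, the two errors come from the "expected" argument of the
compare-exchange: it has to be a pointer to a plain object of the same
integer type as the atomic.  The sketch below shows that shape with the
standard C11 <stdatomic.h> interface rather than the OVS wrappers in
ovs-atomic.h, and `try_reserve_slot` is a hypothetical function, not code
from ovn-northd.c:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Atomically increments '*active' as long as it is below 'limit'.
 * Returns true if a slot was reserved.  Hypothetical example only. */
static bool
try_reserve_slot(_Atomic uint64_t *active, uint64_t limit)
{
    /* 'expected' is a uint64_t, so '&expected' is a uint64_t * and no cast
     * is needed when it is passed to the compare-exchange. */
    uint64_t expected = atomic_load(active);

    while (expected < limit) {
        if (atomic_compare_exchange_weak(active, &expected, expected + 1)) {
            return true;
        }
        /* On failure 'expected' was updated to the current value; retry. */
    }
    return false;
}
```

Declaring the expected value with the matching unsigned type avoids both
the integer-to-pointer error and the pointer-sign error without a cast.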

Thanks
Numan

> ---
>  northd/ovn-northd.c | 363 
>  1 file changed, 301 insertions(+), 62 deletions(-)
>
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index 57df62b92..eb5cbf832 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -39,6 +39,7 @@
>  #include "lib/ovn-util.h"
>  #include "lib/lb.h"
>  #include "memory.h"
> +#include "lib/ovn-parallel-hmap.h"
>  #include "ovn/actions.h"
>  #include "ovn/features.h"
>  #include "ovn/logical-fields.h"
> @@ -539,10 +540,10 @@ struct mcast_switch_info {
>   * be received for queries that were sent 
> out.
>   */
>
> -uint32_t active_v4_flows;   /* Current number of active IPv4 multicast
> +atomic_uint64_t active_v4_flows;   /* Current number of active IPv4 
> multicast
>   * flows.
>   */
> -uint32_t active_v6_flows;   /* Current number of active IPv6 multicast
> +atomic_uint64_t active_v6_flows;   /* Current number of active IPv6 
> multicast
>   * flows.
>   */
>  };
> @@ -1001,8 +1002,8 @@ init_mcast_info_for_switch_datapath(struct ovn_datapath 
> *od)
>  smap_get_ullong(&od->nbs->other_config, "mcast_query_max_response",
>  OVN_MCAST_DEFAULT_QUERY_MAX_RESPONSE_S);
>
> -mcast_sw_info->active_v4_flows = 0;
> -mcast_sw_info->active_v6_flows = 0;
> +mcast_sw_info->active_v4_flows = ATOMIC_VAR_INIT(0);
> +mcast_sw_info->active_v6_flows = ATOMIC_VAR_INIT(0);
>  }
>
>  static void
> @@ -4067,6 +4068,34 @@ ovn_lflow_init(struct ovn_lflow *lflow, struct 
> ovn_datapath *od,
>  /* If this option is 'true' northd will combine logical flows that differ by
>   * logical datapath only by creating a datapath group. */
>  static bool use_logical_dp_groups = false;
> +static bool use_parallel_build = true;
> +
> +static struct hashrow_locks lflow_locks;
> +
> +/* Adds a row with the specified contents to the Logical_Flow table.
> + * Version to use when locking is required.
> + */
> +static void
> +do_ovn_lflow_add(struct hmap *lflow_map, bool shared,
> +struct ovn

Re: [ovs-dev] [PATCH v3 0/5] ipsec: Fix IPv6 support

2021-04-01 Thread Mark Gray
On 01/04/2021 19:29, Ilya Maximets wrote:
> On 4/1/21 3:58 PM, Mark Gray wrote:
>> This series fixes IPv6 support for Libreswan and introduces
>> IPsec system tests for Libreswan.
>>
>> Mark Gray (5):
>>   ipsec: IPv6 default route support for Libreswan
>>   system-common-macros: clean up veth device on test failure
>>   ipsec: Allow custom file locations
>>   ipsec: Introduce IPsec system tests for Libreswan
>>   ipsec: Update ordering of imports
>>
>>  ipsec/ovs-monitor-ipsec.in | 144 ++--
>>  tests/automake.mk  |   3 +-
>>  tests/system-common-macros.at  |   2 +-
>>  tests/system-ipsec.at  | 406 +
>>  tests/system-kmod-testsuite.at |   1 +
>>  5 files changed, 529 insertions(+), 27 deletions(-)
>>  create mode 100644 tests/system-ipsec.at
>>
> 
> Thanks!  Applied to master.
> First patch with the fix also backported down to 2.13.
> 
> Best regards, Ilya Maximets.
> 

Thanks Ilya.

Flavio was having some issues with the system test patch which appears
to have a flake. We have not received his Ack yet. I will follow-up with
a fix if it doesn't work for him or you can revert.



Re: [ovs-dev] [PATCH v4 2/2] Datapath: New MPLS actions for layer 2 tunnelling.

2021-04-01 Thread Ilya Maximets
On 3/26/21 7:21 AM, Martin Varghese wrote:
> From: Martin Varghese 
> 
> Upstream commit:
> 
> commit f66b53fdbb22ced1a323b22b9de84a61aacd8d18
> Author: Martin Varghese 
> Date:   Sat Dec 21 08:50:46 2019 +0530
> 
> openvswitch: New MPLS actions for layer 2 tunnelling
> The existing PUSH MPLS action inserts MPLS header between ethernet
> header and the IP header. Though this behaviour is fine for L3 VPN
> where an IP  packet is encapsulated inside a MPLS tunnel, it does not
> suffice the L2 VPN (l2 tunnelling) requirements. In L2 VPN the MPLS
> header should encapsulate the ethernet packet.
> 
> The new mpls action ADD_MPLS inserts MPLS header at the start of the
> packet or at the start of the l3 header depending on the value of l3
> tunnel flag in the ADD_MPLS arguments.
> 
> POP_MPLS action is extended to support ethertype 0x6558.
> 
> Signed-off-by: Martin Varghese 
> Acked-by: Pravin B Shelar 
> Signed-off-by: David S. Miller 
> 
> Signed-off-by: Martin Varghese 
> ---
>  datapath/actions.c  | 42 -
>  datapath/flow_netlink.c | 33 
>  2 files changed, 66 insertions(+), 9 deletions(-)
> 

The kernel module that is in OVS source tree is deprecated
and we're not accepting new features there.  So, you may
drop this patch from the set.

Best regards, Ilya Maximets.


Re: [ovs-dev] [PATCH v3 0/5] ipsec: Fix IPv6 support

2021-04-01 Thread Ilya Maximets
On 4/1/21 3:58 PM, Mark Gray wrote:
> This series fixes IPv6 support for Libreswan and introduces
> IPsec system tests for Libreswan.
> 
> Mark Gray (5):
>   ipsec: IPv6 default route support for Libreswan
>   system-common-macros: clean up veth device on test failure
>   ipsec: Allow custom file locations
>   ipsec: Introduce IPsec system tests for Libreswan
>   ipsec: Update ordering of imports
> 
>  ipsec/ovs-monitor-ipsec.in | 144 ++--
>  tests/automake.mk  |   3 +-
>  tests/system-common-macros.at  |   2 +-
>  tests/system-ipsec.at  | 406 +
>  tests/system-kmod-testsuite.at |   1 +
>  5 files changed, 529 insertions(+), 27 deletions(-)
>  create mode 100644 tests/system-ipsec.at
> 

Thanks!  Applied to master.
First patch with the fix also backported down to 2.13.

Best regards, Ilya Maximets.


Re: [ovs-dev] [PATCH ovn v5] binding: Fix the crashes seen when port binding type changes.

2021-04-01 Thread Numan Siddique
On Thu, Apr 1, 2021 at 11:47 PM Dumitru Ceara  wrote:
>
> On 4/1/21 8:02 PM, Numan Siddique wrote:
> > On Wed, Mar 31, 2021 at 5:03 PM Dumitru Ceara  wrote:
> >>
> >> On 3/29/21 3:21 PM, num...@ovn.org wrote:
> >>> From: Numan Siddique 
> >>>
> >>> When a port binding type changes from type 'A' to type 'B', then
> >>> there are many code paths in the existing binding.c which results
> >>> in crashes due to use-after-free or NULL references.
> >>>
> >>> Below crashes are seen when a container lport is changed to a normal
> >>> lport and then deleted.
> >>>
> >>> ***
> >>>  (gdb) bt
> >>>0  in raise () from /lib64/libc.so.6
> >>>1  in abort () from /lib64/libc.so.6
> >>>2  ovs_abort_valist ("%s: assertion %s failed in %s()") at 
> >>> lib/util.c:419
> >>>3  vlog_abort_valist ("%s: assertion %s failed in %s()") at 
> >>> lib/vlog.c:1249
> >>>4  vlog_abort ("%s: assertion %s failed in %s()") at lib/vlog.c:1263
> >>>5  ovs_assert_failure (where="lib/ovsdb-idl.c:4653",
> >>>   function="ovsdb_idl_txn_write__",
> >>>   condition="row->new_datum != NULL") at 
> >>> lib/util.c:86
> >>>6  ovsdb_idl_txn_write__ () at lib/ovsdb-idl.c:4695
> >>>7  ovsdb_idl_txn_write_clone () at lib/ovsdb-idl.c:4757
> >>>8  sbrec_port_binding_set_chassis () at lib/ovn-sb-idl.c:25946
> >>>9  release_lport () at controller/binding.c:971
> >>>   10  release_local_binding_children () at controller/binding.c:1039
> >>>   11  release_local_binding () at controller/binding.c:1056
> >>>   12  consider_iface_release (iface_rec=.. 
> >>> iface_id="bb43e818-b2ee-4329-b67e-218556580056") at 
> >>> controller/binding.c:1880
> >>>   13  binding_handle_ovs_interface_changes () at controller/binding.c:1998
> >>>   14  runtime_data_ovs_interface_handler () at 
> >>> controller/ovn-controller.c:1481
> >>>   15  engine_compute () at lib/inc-proc-eng.c:306
> >>>   16  engine_run_node () at lib/inc-proc-eng.c:352
> >>>   17  engine_run () at lib/inc-proc-eng.c:377
> >>>   18  main () at controller/ovn-controller.c:2826
> >>>
> >>> The present code creates a 'struct local_binding' instance for a
> >>> container lport and adds this object to the parent local binding
> >>> children list.  And if the container lport is changed to a normal
> >>> vif, then there is no way to access the local binding object created
> >>> earlier.  This patch fixes these type of issues by refactoring the
> >>> 'local binding' code of binding.c.  This patch now creates only one
> >>> instance of 'struct local_binding' for every OVS interface with
> >>> external_ids:iface-id set.  A new structure 'struct binding_lport' is
> >>> added which is created for a VIF, container and virtual port bindings
> >>> and is stored in 'binding_lports' shash.  'struct local_binding' now
> >>> maintains a list of binding_lports which it maps to.
> >>>
> >>> When a container lport is changed to a normal lport, we now can
> >>> easily access the 'binding_lport' object of the container lport
> >>> from the 'binding_lports' shash.
> >>>
> >>> A new debug unixctl command is added - debug/dump-local-bindings,
> >>> which dumps the local bindings stored by the ovn-controller.  This
> >>> command is also used in the test cases to validate that ovn-controller
> >>> maintains proper local bindings.
> >>>
> >>> Reported-by: Dumitru Ceara 
> >>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1936328
> >>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1936331
> >>>
> >>> Co-authored-by: Dumitru Ceara 
> >>> [dce...@redhat.com contributed to the test cases which helped in 
> >>> reproducing the crashes.]
> >>> Signed-off-by: Dumitru Ceara 
> >>> Signed-off-by: Numan Siddique 
> >>> ---
> >>
> >> Hi Numan,
> >>
> >> I think the changes are OK, I hope we didn't miss any cases though.  As
> >> Mark mentioned on an older revision, the additional tests do give a
> >> certain level of confidence.
> >>
> >> I'd recommend removing the "co-authored-by" tag above as you've been
> >> adding and improving the tests a lot since we first hit this issue and
> >> since v1 was posted.
> >>
> >> That would also allow me to:
> >>
> >> Acked-by: Dumitru Ceara 
> >>
> >> I do have some minor nits/comments below, please have a look at them
> >> before pushing this change.
> >>
> >
> > Thanks for the review and for the comments.
> >
> > I applied this patch to master addressing all your comments.  Below is the 
> > diff
> > of changes as suggested by you on top of this patch.
> >
> > In case I've missed addressing your comment  due to my negligence, do
> > let me know.
>
> Thanks for taking care of this, the diff looks good to me.

Great.  Thanks.  I'm working on applying this patch to branch-21.03.  I
will restrict it to branch-21.03 for now.

Regards
Numan

>
> Regards,
> Dumitru
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
__

Re: [ovs-dev] [PATCH ovn v5] binding: Fix the crashes seen when port binding type changes.

2021-04-01 Thread Dumitru Ceara
On 4/1/21 8:02 PM, Numan Siddique wrote:
> On Wed, Mar 31, 2021 at 5:03 PM Dumitru Ceara  wrote:
>>
>> On 3/29/21 3:21 PM, num...@ovn.org wrote:
>>> From: Numan Siddique 
>>>
>>> When a port binding type changes from type 'A' to type 'B', then
>>> there are many code paths in the existing binding.c which results
>>> in crashes due to use-after-free or NULL references.
>>>
>>> Below crashes are seen when a container lport is changed to a normal
>>> lport and then deleted.
>>>
>>> ***
>>>  (gdb) bt
>>>0  in raise () from /lib64/libc.so.6
>>>1  in abort () from /lib64/libc.so.6
>>>2  ovs_abort_valist ("%s: assertion %s failed in %s()") at lib/util.c:419
>>>3  vlog_abort_valist ("%s: assertion %s failed in %s()") at 
>>> lib/vlog.c:1249
>>>4  vlog_abort ("%s: assertion %s failed in %s()") at lib/vlog.c:1263
>>>5  ovs_assert_failure (where="lib/ovsdb-idl.c:4653",
>>>   function="ovsdb_idl_txn_write__",
>>>   condition="row->new_datum != NULL") at 
>>> lib/util.c:86
>>>6  ovsdb_idl_txn_write__ () at lib/ovsdb-idl.c:4695
>>>7  ovsdb_idl_txn_write_clone () at lib/ovsdb-idl.c:4757
>>>8  sbrec_port_binding_set_chassis () at lib/ovn-sb-idl.c:25946
>>>9  release_lport () at controller/binding.c:971
>>>   10  release_local_binding_children () at controller/binding.c:1039
>>>   11  release_local_binding () at controller/binding.c:1056
>>>   12  consider_iface_release (iface_rec=.. 
>>> iface_id="bb43e818-b2ee-4329-b67e-218556580056") at 
>>> controller/binding.c:1880
>>>   13  binding_handle_ovs_interface_changes () at controller/binding.c:1998
>>>   14  runtime_data_ovs_interface_handler () at 
>>> controller/ovn-controller.c:1481
>>>   15  engine_compute () at lib/inc-proc-eng.c:306
>>>   16  engine_run_node () at lib/inc-proc-eng.c:352
>>>   17  engine_run () at lib/inc-proc-eng.c:377
>>>   18  main () at controller/ovn-controller.c:2826
>>>
>>> The present code creates a 'struct local_binding' instance for a
>>> container lport and adds this object to the parent local binding
>>> children list.  And if the container lport is changed to a normal
>>> vif, then there is no way to access the local binding object created
>>> earlier.  This patch fixes these type of issues by refactoring the
>>> 'local binding' code of binding.c.  This patch now creates only one
>>> instance of 'struct local_binding' for every OVS interface with
>>> external_ids:iface-id set.  A new structure 'struct binding_lport' is
>>> added which is created for a VIF, container and virtual port bindings
>>> and is stored in 'binding_lports' shash.  'struct local_binding' now
>>> maintains a list of binding_lports which it maps to.
>>>
>>> When a container lport is changed to a normal lport, we now can
>>> easily access the 'binding_lport' object of the container lport
>>> from the 'binding_lports' shash.
>>>
>>> A new debug unixctl command is added - debug/dump-local-bindings,
>>> which dumps the local bindings stored by the ovn-controller.  This
>>> command is also used in the test cases to validate that ovn-controller
>>> maintains proper local bindings.
>>>
>>> Reported-by: Dumitru Ceara 
>>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1936328
>>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1936331
>>>
>>> Co-authored-by: Dumitru Ceara 
>>> [dce...@redhat.com contributed to the test cases which helped in 
>>> reproducing the crashes.]
>>> Signed-off-by: Dumitru Ceara 
>>> Signed-off-by: Numan Siddique 
>>> ---
>>
>> Hi Numan,
>>
>> I think the changes are OK, I hope we didn't miss any cases though.  As
>> Mark mentioned on an older revision, the additional tests do give a
>> certain level of confidence.
>>
>> I'd recommend removing the "co-authored-by" tag above as you've been
>> adding and improving the tests a lot since we first hit this issue and
>> since v1 was posted.
>>
>> That would also allow me to:
>>
>> Acked-by: Dumitru Ceara 
>>
>> I do have some minor nits/comments below, please have a look at them
>> before pushing this change.
>>
> 
> Thanks for the review and for the comments.
> 
> I applied this patch to master addressing all your comments.  Below is the
> diff of changes as suggested by you on top of this patch.
> 
> In case I've missed addressing your comment due to my negligence, do
> let me know.

Thanks for taking care of this, the diff looks good to me.

Regards,
Dumitru



Re: [ovs-dev] [PATCH ovn v5] binding: Fix the crashes seen when port binding type changes.

2021-04-01 Thread Numan Siddique
On Wed, Mar 31, 2021 at 5:03 PM Dumitru Ceara  wrote:
>
> On 3/29/21 3:21 PM, num...@ovn.org wrote:
> > From: Numan Siddique 
> >
> > When a port binding type changes from type 'A' to type 'B', then
> > there are many code paths in the existing binding.c which result
> > in crashes due to use-after-free or NULL references.
> >
> > Below crashes are seen when a container lport is changed to a normal
> > lport and then deleted.
> >
> > ***
> >  (gdb) bt
> >0  in raise () from /lib64/libc.so.6
> >1  in abort () from /lib64/libc.so.6
> >2  ovs_abort_valist ("%s: assertion %s failed in %s()") at lib/util.c:419
> >3  vlog_abort_valist ("%s: assertion %s failed in %s()") at lib/vlog.c:1249
> >4  vlog_abort ("%s: assertion %s failed in %s()") at lib/vlog.c:1263
> >5  ovs_assert_failure (where="lib/ovsdb-idl.c:4653",
> >   function="ovsdb_idl_txn_write__",
> >   condition="row->new_datum != NULL") at lib/util.c:86
> >6  ovsdb_idl_txn_write__ () at lib/ovsdb-idl.c:4695
> >7  ovsdb_idl_txn_write_clone () at lib/ovsdb-idl.c:4757
> >8  sbrec_port_binding_set_chassis () at lib/ovn-sb-idl.c:25946
> >9  release_lport () at controller/binding.c:971
> >   10  release_local_binding_children () at controller/binding.c:1039
> >   11  release_local_binding () at controller/binding.c:1056
> >   12  consider_iface_release (iface_rec=.. iface_id="bb43e818-b2ee-4329-b67e-218556580056") at controller/binding.c:1880
> >   13  binding_handle_ovs_interface_changes () at controller/binding.c:1998
> >   14  runtime_data_ovs_interface_handler () at controller/ovn-controller.c:1481
> >   15  engine_compute () at lib/inc-proc-eng.c:306
> >   16  engine_run_node () at lib/inc-proc-eng.c:352
> >   17  engine_run () at lib/inc-proc-eng.c:377
> >   18  main () at controller/ovn-controller.c:2826
> >
> > The present code creates a 'struct local_binding' instance for a
> > container lport and adds this object to the parent local binding
> > children list.  And if the container lport is changed to a normal
> > vif, then there is no way to access the local binding object created
> > earlier.  This patch fixes these types of issues by refactoring the
> > 'local binding' code of binding.c.  This patch now creates only one
> > instance of 'struct local_binding' for every OVS interface with
> > external_ids:iface-id set.  A new structure 'struct binding_lport' is
> > added which is created for a VIF, container and virtual port bindings
> > and is stored in 'binding_lports' shash.  'struct local_binding' now
> > maintains a list of binding_lports which it maps to.
> >
> > When a container lport is changed to a normal lport, we now can
> > easily access the 'binding_lport' object of the container lport
> > from the 'binding_lports' shash.
> >
> > A new debug unixctl command is added - debug/dump-local-bindings,
> > which dumps the local bindings stored by the ovn-controller.  This
> > command is also used in the test cases to validate that ovn-controller
> > maintains proper local bindings.
> >
> > Reported-by: Dumitru Ceara 
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1936328
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1936331
> >
> > Co-authored-by: Dumitru Ceara 
> > [dce...@redhat.com contributed to the test cases which helped in 
> > reproducing the crashes.]
> > Signed-off-by: Dumitru Ceara 
> > Signed-off-by: Numan Siddique 
> > ---
>
> Hi Numan,
>
> I think the changes are OK, I hope we didn't miss any cases though.  As
> Mark mentioned on an older revision, the additional tests do give a
> certain level of confidence.
>
> I'd recommend removing the "co-authored-by" tag above as you've been
> adding and improving the tests a lot since we first hit this issue and
> since v1 was posted.
>
> That would also allow me to:
>
> Acked-by: Dumitru Ceara 
>
> I do have some minor nits/comments below, please have a look at them
> before pushing this change.
>

Thanks for the review and for the comments.

I applied this patch to master addressing all your comments.  Below is the diff
of changes as suggested by you on top of this patch.

In case I've missed addressing your comment due to my negligence, do
let me know.
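
For readers following the thread, the data model described in the quoted
commit message can be sketched roughly as follows.  This is only an
illustration in Python, not the C code in binding.c: the class and method
names (LocalBindingData, add_lport, change_type) are hypothetical, and the
real structures carry more state.  The point is that every lport is indexed
by name in a single map, so it can still be found after its type changes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)
class BindingLport:
    """Illustrative stand-in for 'struct binding_lport'."""
    name: str
    lport_type: str  # "vif", "container" or "virtual"
    lbinding: Optional["LocalBinding"] = None


@dataclass(eq=False)
class LocalBinding:
    """Illustrative stand-in for 'struct local_binding': one per OVS
    interface that has external_ids:iface-id set."""
    iface_id: str
    binding_lports: List[BindingLport] = field(default_factory=list)


class LocalBindingData:
    """Hypothetical container holding both lookup maps."""

    def __init__(self) -> None:
        self.bindings: Dict[str, LocalBinding] = {}  # keyed by iface-id
        self.lports: Dict[str, BindingLport] = {}    # keyed by lport name

    def add_lport(self, iface_id: str, name: str,
                  lport_type: str) -> BindingLport:
        lbinding = self.bindings.setdefault(iface_id, LocalBinding(iface_id))
        lport = BindingLport(name, lport_type, lbinding)
        lbinding.binding_lports.append(lport)
        self.lports[name] = lport
        return lport

    def change_type(self, name: str, new_type: str,
                    new_iface_id: str) -> BindingLport:
        # The lport is found by name, whatever its previous type was.
        lport = self.lports[name]
        if lport.lbinding is not None:
            lport.lbinding.binding_lports.remove(lport)
        lport.lport_type = new_type
        lport.lbinding = self.bindings.setdefault(
            new_iface_id, LocalBinding(new_iface_id))
        lport.lbinding.binding_lports.append(lport)
        return lport
```

With this layout, changing a container lport into a normal VIF detaches it
from the parent's list and re-attaches it to its own local binding, without
leaving an unreachable object behind.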




diff --git a/controller/binding.c b/controller/binding.c
index b4dc6982c9..5bdef295cd 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -744,13 +744,12 @@ binding_dump_local_bindings(struct local_binding_data *lbinding_data,
 struct ds *out_data)
 {
 const struct shash_node **nodes;
-size_t i, n;

 nodes = shash_sort(&lbinding_data->bindings);
-n = shash_count(&lbinding_data->bindings);
+size_t n = shash_count(&lbinding_data->bindings);

 ds_put_cstr(out_data, "Local bindings:\n");
-for (i = 0; i < n; i++) {
+for (size_t i = 0; i < n; i++) {
 const struct shash_node *node = nodes[i];
 str

[ovs-dev] [PATCH v2 ovn] Support 802.11ad EthType for localnet ports

2021-04-01 Thread Ihar Hrachyshka
In some environments, hardware serving the fabric network doesn't
support double 802.1q (0x8100) VLAN tags, but does support 802.1ad
(0x88a8) EthType for two-layer VLAN traffic. Specifically, Cisco UCS
VIC hardware was identified as affected by this limitation.

With vlan-passthru=true set for a logical switch, VLAN tagged traffic
may be generated by VIFs. This patch makes it possible to support the
feature in affected hardware environments.
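
As a rough illustration of the behaviour described above, the option lookup
and the resulting VLAN tag can be sketched in Python.  This is not the
ovn-controller code; the function names are hypothetical, and the fallback to
0x8100 on a missing or unparsable value mirrors what the C helper in this
patch appears to do.  0x88a8 is the value used by the patch's test.

```python
import struct

DEFAULT_VLAN_ETHTYPE = 0x8100  # 802.1Q TPID


def vlan_ethtype(options):
    """Return the TPID configured in a localnet port's options.

    'options' is a dict (or None); the "ethtype" key holds a hexadecimal
    string.  Missing or invalid values fall back to 802.1Q.
    """
    value = (options or {}).get("ethtype")
    if value is None:
        return DEFAULT_VLAN_ETHTYPE
    try:
        return int(value, 16)
    except ValueError:
        return DEFAULT_VLAN_ETHTYPE


def vlan_tag(options, vid):
    """Pack the 4-byte VLAN tag (TPID, then TCI) in network byte order."""
    return struct.pack("!HH", vlan_ethtype(options), vid & 0x0FFF)
```

For example, vlan_tag({"ethtype": "88a8"}, 100) yields an 802.1ad tag, while
vlan_tag(None, 100) yields a plain 802.1Q tag.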

Signed-off-by: Ihar Hrachyshka 

---

v1: initial version.
v2: fixed test scenario to actually validate packets sent by vifs.
v2: stylistic (spacing) change for a ternary operator.
---
 NEWS  |  1 +
 controller/physical.c | 34 +
 ovn-nb.xml|  6 +++
 tests/ovn.at  | 85 +++
 4 files changed, 118 insertions(+), 8 deletions(-)

diff --git a/NEWS b/NEWS
index 530c5d42f..9e45837c7 100644
--- a/NEWS
+++ b/NEWS
@@ -7,6 +7,7 @@ Post-v21.03.0
 (This may take testing and tuning to be effective.)  This version of OVN
 requires DDLog 0.36.
   - Introduce ovn-controller incremetal processing engine statistics
+  - Support custom (802.1ad, 0x88a8) EthType for localnet ports.
 
 OVN v21.03.0 - 12 Mar 2021
 -
diff --git a/controller/physical.c b/controller/physical.c
index fa5d0d692..76e4f302e 100644
--- a/controller/physical.c
+++ b/controller/physical.c
@@ -609,6 +609,28 @@ put_replace_chassis_mac_flows(const struct simap *ct_zones,
 }
 }
 
+static void
+ofpact_put_push_vlan(struct ofpbuf *ofpacts, const struct smap *options, int tag)
+{
+const char *ethtype_opt = NULL;
+if (options) {
+ethtype_opt = smap_get(options, "ethtype");
+}
+
+int ethtype;
+if (!ethtype_opt || !str_to_int(ethtype_opt, 16, &ethtype)) {
+ethtype = 0x8100;
+}
+struct ofpact_push_vlan *push_vlan;
+push_vlan = ofpact_put_PUSH_VLAN(ofpacts);
+push_vlan->ethertype = htons(ethtype);
+
+struct ofpact_vlan_vid *vlan_vid;
+vlan_vid = ofpact_put_SET_VLAN_VID(ofpacts);
+vlan_vid->vlan_vid = tag;
+vlan_vid->push_vlan_if_needed = false;
+}
+
 static void
 put_replace_router_port_mac_flows(struct ovsdb_idl_index
   *sbrec_port_binding_by_name,
@@ -696,10 +718,7 @@ put_replace_router_port_mac_flows(struct ovsdb_idl_index
 replace_mac->mac = chassis_mac;
 
 if (tag) {
-struct ofpact_vlan_vid *vlan_vid;
-vlan_vid = ofpact_put_SET_VLAN_VID(ofpacts_p);
-vlan_vid->vlan_vid = tag;
-vlan_vid->push_vlan_if_needed = true;
+ofpact_put_push_vlan(ofpacts_p, &localnet_port->options, tag);
 }
 
 ofpact_put_OUTPUT(ofpacts_p)->port = ofport;
@@ -1195,10 +1214,9 @@ consider_port_binding(struct ovsdb_idl_index *sbrec_port_binding_by_name,
 if (tag) {
 /* For containers sitting behind a local vif, tag the packets
  * before delivering them. */
-struct ofpact_vlan_vid *vlan_vid;
-vlan_vid = ofpact_put_SET_VLAN_VID(ofpacts_p);
-vlan_vid->vlan_vid = tag;
-vlan_vid->push_vlan_if_needed = true;
+ofpact_put_push_vlan(
+ofpacts_p, localnet_port ? &localnet_port->options : NULL,
+tag);
 }
 ofpact_put_OUTPUT(ofpacts_p)->port = ofport;
 if (tag) {
diff --git a/ovn-nb.xml b/ovn-nb.xml
index b0a4adffe..fca22988b 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -830,6 +830,12 @@
   uses its local configuration to determine exactly how to connect to
   this locally accessible network, if at all.
 
+
+
+  Optional. VLAN EtherType field value for encapsulating VLAN
+  headers. Supported values: 8100 (default), 88a8 (QinQ).
+
+
   
 
   
diff --git a/tests/ovn.at b/tests/ovn.at
index 391a8bcd9..225c33610 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -3164,6 +3164,91 @@ OVN_CLEANUP([hv-1],[hv-2])
 AT_CLEANUP
 ])
 
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([ovn -- VLAN transparency, passthru=true, multiple hosts, custom ethtype])
+ovn_start
+
+ethtype=88a8
+
+check ovn-nbctl ls-add ls
+check ovn-nbctl --wait=sb add Logical-Switch ls other_config vlan-passthru=true
+
+ln_port_name=ln-100
+ovn-nbctl lsp-add ls $ln_port_name "" 100
+ovn-nbctl lsp-set-addresses $ln_port_name unknown
+ovn-nbctl lsp-set-type $ln_port_name localnet
+ovn-nbctl lsp-set-options $ln_port_name network_name=phys-100 ethtype=0x$ethtype
+net_add n-100
+
+# two hypervisors, each connected to the same network
+for i in 1 2; do
+sim_add hv-$i
+as hv-$i
+ovs-vsctl add-br br-phys
+ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys-100:br-phys
+ovn_attach n-100 br-phys 192.168.0.$i
+done
+
+for i in 1 2; do
+check ovn-nbctl lsp-add ls lsp$i
+check ovn-nbctl lsp-set-addresses lsp$i f0:00:00:00:00:0$i
+done
+
+for i in 1 2; do
+as hv-$i
+ovs-vsctl add-port b

Re: [ovs-dev] ovn-northd-ddlog scale issues

2021-04-01 Thread Ben Pfaff
On Thu, Apr 01, 2021 at 06:33:21PM +0200, Dumitru Ceara wrote:
> On 4/1/21 5:48 PM, Ben Pfaff wrote:
> > On Wed, Mar 24, 2021 at 04:03:07PM +0100, Dumitru Ceara wrote:
> >> Hi Ben,
> >>
> >> We discussed a bit about this during one of the recent IRC OVN meetings,
> >> but I didn't get around to properly reporting this until now.
> >>
> >> I've tried running ovn-northd-ddlog against some large OVN NB/DB
> >> databases extracted from one of our scale testing runs:
> >>
> >> http://people.redhat.com/~dceara/ovn-northd-ddlog-tests/20210324/existing-nb-sb/
> >>
> >> It seems that ovn-northd-ddlog gets stuck in a busy loop and uses a lot
> >> of memory:
> >>
> >> 775734 root  10 -10   81.6g  80.8g  22396 S  99.7  64.2   3:50.79 ovn-northd-ddlog
> > 
> > I am game to try to reproduce and fix this.  I haven't tried reproducing
> > from a database snapshot before, so it'll be a new adventure.
> > 
> > But those files are 403 Forbidden, even though the directory they're in
> > comes up fine.
> > 
> 
> Oops, really sorry about that, my bad.
> 
> Can you, please, try again now?

Thanks!  I can download them now.  It's back on my to-do list.


Re: [ovs-dev] ovn-northd-ddlog scale issues

2021-04-01 Thread Dumitru Ceara
On 4/1/21 5:48 PM, Ben Pfaff wrote:
> On Wed, Mar 24, 2021 at 04:03:07PM +0100, Dumitru Ceara wrote:
>> Hi Ben,
>>
>> We discussed a bit about this during one of the recent IRC OVN meetings,
>> but I didn't get around to properly reporting this until now.
>>
>> I've tried running ovn-northd-ddlog against some large OVN NB/DB
>> databases extracted from one of our scale testing runs:
>>
>> http://people.redhat.com/~dceara/ovn-northd-ddlog-tests/20210324/existing-nb-sb/
>>
>> It seems that ovn-northd-ddlog gets stuck in a busy loop and uses a lot
>> of memory:
>>
>> 775734 root  10 -10   81.6g  80.8g  22396 S  99.7  64.2   3:50.79 ovn-northd-ddlog
> 
> I am game to try to reproduce and fix this.  I haven't tried reproducing
> from a database snapshot before, so it'll be a new adventure.
> 
> But those files are 403 Forbidden, even though the directory they're in
> comes up fine.
> 

Oops, really sorry about that, my bad.

Can you, please, try again now?

Thanks,
Dumitru



Re: [ovs-dev] [PATCH v3 1/5] ipsec: IPv6 default route support for Libreswan

2021-04-01 Thread Mark Gray
On 01/04/2021 14:58, Mark Gray wrote:
> When configuring IPsec, "ovs-monitor-ipsec" honours
> the 'local_ip' option in the 'Interface' table by configuring
> the 'left' side of the Libreswan connection with 'local_ip'.
> If 'local_ip' is not specified, "ovs-monitor-ipsec" sets
> 'left' to '%defaultroute' which is interpreted as the IP
> address of the default gateway interface.
> 
> However, when 'remote_ip' is an IPv6 address, Libreswan
> still interprets '%defaultroute' as the IPv4 address on the
> default gateway interface (see:
> https://github.com/libreswan/libreswan/issues/416) giving
> an "address family inconsistency" error.
> 
> This patch resolves this issue by specifying the
> connection as IPv6 when the 'remote_ip' is IPv6 and
> 'local_ip' has not been set.
> 
> Fixes: 22c5eafb6efa ("ipsec: reintroduce IPsec support for tunneling")
> Signed-off-by: Mark Gray 
> Acked-by: Flavio Leitner 
> Acked-by: Aaron Conole 
> Acked-by: Eelco Chaudron 
> ---

FYI, I think this should be back-ported as it fixes a bug in IPsec. The
others update the test framework and probably don't need to be.
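
For context, the selection logic described in the quoted commit message can be
sketched as below.  This is a simplified illustration rather than the code in
ovs-monitor-ipsec; the function name is hypothetical, and how the resulting
family is then written into the Libreswan connection is deliberately left out.

```python
import ipaddress


def connection_family(remote_ip, local_ip=None):
    """Return (left, family) for a tunnel.

    When 'local_ip' is unset, 'left' becomes '%defaultroute' and the
    address family has to be taken from 'remote_ip', since Libreswan
    would otherwise assume IPv4.
    """
    if local_ip:
        left = local_ip
        version = ipaddress.ip_address(local_ip).version
    else:
        left = "%defaultroute"
        version = ipaddress.ip_address(remote_ip).version
    return left, "ipv6" if version == 6 else "ipv4"
```

With an IPv6 'remote_ip' and no 'local_ip', this returns '%defaultroute'
together with an explicit IPv6 family, which is the case the patch addresses.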



Re: [ovs-dev] ovn-northd-ddlog scale issues

2021-04-01 Thread Ben Pfaff
On Wed, Mar 24, 2021 at 04:03:07PM +0100, Dumitru Ceara wrote:
> Hi Ben,
> 
> We discussed a bit about this during one of the recent IRC OVN meetings,
> but I didn't get around to properly reporting this until now.
> 
> I've tried running ovn-northd-ddlog against some large OVN NB/DB
> databases extracted from one of our scale testing runs:
> 
> http://people.redhat.com/~dceara/ovn-northd-ddlog-tests/20210324/existing-nb-sb/
> 
> It seems that ovn-northd-ddlog gets stuck in a busy loop and uses a lot
> of memory:
> 
> 775734 root  10 -10   81.6g  80.8g  22396 S  99.7  64.2   3:50.79 ovn-northd-ddlog

I am game to try to reproduce and fix this.  I haven't tried reproducing
from a database snapshot before, so it'll be a new adventure.

But those files are 403 Forbidden, even though the directory they're in
comes up fine.


[ovs-dev] [PATCH ovn] northd: introduce per-lb lb_skip_snat option

2021-04-01 Thread Lorenzo Bianconi
Introduce the lb_skip_snat option in the load balancer options column so
that traffic hitting a given load balancer is not force-SNATed, even when
the load balancer is applied to a logical router where lb_force_snat has
been configured.

https://bugzilla.redhat.com/show_bug.cgi?id=1927540
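
The intended precedence can be sketched in Python as follows.  This is only an
illustration modelled on the three-way result (0, 1, 2) of the DDlog function
in this patch; the names LbSnat, snat_for_lb and lb_action are hypothetical
here, and the router-side check is reduced to a single boolean.

```python
from enum import IntEnum


class LbSnat(IntEnum):
    NONE = 0   # neither flag is set
    FORCE = 1  # router is configured to force SNAT for load-balanced traffic
    SKIP = 2   # load balancer has lb_skip_snat=true


def snat_for_lb(router_forces_snat, lb_options):
    """The per-load-balancer skip option wins over the router setting."""
    if lb_options.get("lb_skip_snat", "false").lower() == "true":
        return LbSnat.SKIP
    if router_forces_snat:
        return LbSnat.FORCE
    return LbSnat.NONE


def lb_action(mode, ct_action):
    """Prefix the conntrack action with the matching logical flag."""
    prefix = {
        LbSnat.SKIP: "flags.skip_snat_for_lb = 1; ",
        LbSnat.FORCE: "flags.force_snat_for_lb = 1; ",
        LbSnat.NONE: "",
    }[mode]
    return prefix + ct_action
```

So a router with force SNAT configured still produces
"flags.skip_snat_for_lb = 1; ct_dnat;" for a load balancer that sets
lb_skip_snat=true.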

Tested-by: Andrew Stoycos 
Signed-off-by: Lorenzo Bianconi 
---
 include/ovn/logical-fields.h |  5 
 lib/logical-fields.c |  4 
 northd/lrouter.dl| 10 ++--
 northd/ovn-northd.8.xml  | 25 +++-
 northd/ovn-northd.c  | 44 
 northd/ovn_northd.dl | 32 +++---
 tests/ovn-northd.at  | 25 +++-
 7 files changed, 119 insertions(+), 26 deletions(-)

diff --git a/include/ovn/logical-fields.h b/include/ovn/logical-fields.h
index 017176f98..d44b30b30 100644
--- a/include/ovn/logical-fields.h
+++ b/include/ovn/logical-fields.h
@@ -66,6 +66,7 @@ enum mff_log_flags_bits {
 MLF_LOOKUP_MAC_BIT = 6,
 MLF_LOOKUP_LB_HAIRPIN_BIT = 7,
 MLF_LOOKUP_FDB_BIT = 8,
+MLF_SKIP_SNAT_FOR_LB_BIT = 9,
 };
 
 /* MFF_LOG_FLAGS_REG flag assignments */
@@ -102,6 +103,10 @@ enum mff_log_flags {
 
 /* Indicate that the lookup in the fdb table was successful. */
 MLF_LOOKUP_FDB = (1 << MLF_LOOKUP_FDB_BIT),
+
+/* Indicate that a packet must not SNAT in the gateway router when
+ * load-balancing has taken place. */
+MLF_SKIP_SNAT_FOR_LB = (1 << MLF_SKIP_SNAT_FOR_LB_BIT),
 };
 
 /* OVN logical fields
diff --git a/lib/logical-fields.c b/lib/logical-fields.c
index 9d08b44c2..72853013e 100644
--- a/lib/logical-fields.c
+++ b/lib/logical-fields.c
@@ -121,6 +121,10 @@ ovn_init_symtab(struct shash *symtab)
  MLF_FORCE_SNAT_FOR_LB_BIT);
 expr_symtab_add_subfield(symtab, "flags.force_snat_for_lb", NULL,
  flags_str);
+snprintf(flags_str, sizeof flags_str, "flags[%d]",
+ MLF_SKIP_SNAT_FOR_LB_BIT);
+expr_symtab_add_subfield(symtab, "flags.skip_snat_for_lb", NULL,
+ flags_str);
 
 /* Connection tracking state. */
 expr_symtab_add_field_scoped(symtab, "ct_mark", MFF_CT_MARK, NULL, false,
diff --git a/northd/lrouter.dl b/northd/lrouter.dl
index 36cedd2dc..5eb06c77f 100644
--- a/northd/lrouter.dl
+++ b/northd/lrouter.dl
@@ -355,8 +355,14 @@ function lb_force_snat_router_ip(lr_options: Map): bool {
 lr_options.contains_key("chassis")
 }
 
-function force_snat_for_lb(lr: nb::Logical_Router): bool {
-not get_force_snat_ip(lr, "lb").is_empty() or lb_force_snat_router_ip(lr.options)
+function snat_for_lb(lr: nb::Logical_Router, lb: Ref): bit<8> {
+if (lb.options.get_bool_def("lb_skip_snat", false)) {
+return 2
+};
+if (not get_force_snat_ip(lr, "lb").is_empty() or lb_force_snat_router_ip(lr.options)) {
+return 1
+};
+return 0
 }
 
 /* For each router, collect the set of IPv4 and IPv6 addresses used for SNAT,
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index a62f5c057..96ab11892 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -2754,7 +2754,11 @@ icmp6 {
 (and optional port numbers) to load balance to.  If the router is
 configured to force SNAT any load-balanced packets, the above action
 will be replaced by flags.force_snat_for_lb = 1;
-ct_lb(args);. If health check is enabled, then
+ct_lb(args);.
+If the load balancing rule is configured with lb_skip_snat
+set to true, the above action will be replaced by
+flags.skip_snat_for_lb = 1; ct_lb(args);.
+If health check is enabled, then
 args will only contain those endpoints whose service
 monitor status entry in OVN_Southbound db is
 either online or empty.
@@ -2771,6 +2775,9 @@ icmp6 {
 with an action of ct_dnat;. If the router is
 configured to force SNAT any load-balanced packets, the above action
 will be replaced by flags.force_snat_for_lb = 1; ct_dnat;.
+If the load balancing rule is configured with lb_skip_snat
+set to true, the above action will be replaced by
+flags.skip_snat_for_lb = 1; ct_dnat;.
   
 
   
@@ -2785,6 +2792,9 @@ icmp6 {
 to force SNAT any load-balanced packets, the above action will be
 replaced by flags.force_snat_for_lb = 1;
 ct_lb(args);.
+If the load balancing rule is configured with lb_skip_snat
+set to true, the above action will be replaced by
+flags.skip_snat_for_lb = 1; ct_lb(args);.
   
 
   
@@ -2797,6 +2807,9 @@ icmp6 {
 If the router is configured to force SNAT any load-balanced
 packets, the above action will be replaced by
 flags.force_snat_for_lb = 1; ct_dnat;.
+If the load balancing rule is configured with lb_skip_snat
+set to true, the above action will be replaced by
+flags.skip_snat_for_lb = 1; ct_dnat;.
   
 
   

[ovs-dev] [PATCH v3 5/5] ipsec: Update ordering of imports

2021-04-01 Thread Mark Gray
Signed-off-by: Mark Gray 
Acked-by: Flavio Leitner 
Acked-by: Aaron Conole 
Acked-by: Eelco Chaudron 
---
 ipsec/ovs-monitor-ipsec.in | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ipsec/ovs-monitor-ipsec.in b/ipsec/ovs-monitor-ipsec.in
index a9542477577d..89a36fe17b47 100755
--- a/ipsec/ovs-monitor-ipsec.in
+++ b/ipsec/ovs-monitor-ipsec.in
@@ -14,12 +14,12 @@
 # limitations under the License.
 
 import argparse
+import copy
 import ipaddress
+import os
 import re
 import subprocess
 import sys
-import copy
-import os
 from string import Template
 
 import ovs.daemon
-- 
2.27.0



[ovs-dev] [PATCH v3 4/5] ipsec: Introduce IPsec system tests for Libreswan

2021-04-01 Thread Mark Gray
This patch adds system tests for OVS IPsec using Libreswan.
If Libreswan is not present on the system, the tests will
be skipped.

These tests set up an underlay switch with bridge 'br0'
to carry encrypted traffic between two emulated "nodes".
Each "node" is a separate network namespace ('left' and
'right') and runs an instance of the Libreswan "pluto"
daemon, ovs-monitor-ipsec, ovs-vswitchd and ovsdb-server.

Each test sets up IPsec between the two emulated "nodes"
using various configurations (currently tunnel
type, IPv4/IPv6, authentication method, local_ip). After
configuration, connectivity between the two nodes is
tested and the underlay traffic is also inspected to
ensure the traffic is encrypted.

All IPsec system tests can be run by using the ipsec
keyword:

sudo make check-kernel TESTSUITEFLAGS='-k ipsec'

Signed-off-by: Mark Gray 
Acked-by: Aaron Conole 
Acked-by: Eelco Chaudron 
---
v2: removed sleep, addressed libreswan path length bug, move
geneve comment
v3: added additional check that libreswan connections are active
 tests/automake.mk  |   3 +-
 tests/system-ipsec.at  | 406 +
 tests/system-kmod-testsuite.at |   1 +
 3 files changed, 409 insertions(+), 1 deletion(-)
 create mode 100644 tests/system-ipsec.at

diff --git a/tests/automake.mk b/tests/automake.mk
index 44a65849ccac..1a528aa394ff 100644
--- a/tests/automake.mk
+++ b/tests/automake.mk
@@ -173,6 +173,7 @@ SYSTEM_TESTSUITE_AT = \
tests/system-common-macros.at \
tests/system-layer3-tunnels.at \
tests/system-traffic.at \
+   tests/system-ipsec.at \
tests/system-interface.at
 
 SYSTEM_OFFLOADS_TESTSUITE_AT = \
@@ -200,7 +201,7 @@ SYSTEM_DPDK_TESTSUITE = $(srcdir)/tests/system-dpdk-testsuite
 OVSDB_CLUSTER_TESTSUITE = $(srcdir)/tests/ovsdb-cluster-testsuite
 DISTCLEANFILES += tests/atconfig tests/atlocal
 
-AUTOTEST_PATH = utilities:vswitchd:ovsdb:vtep:tests:$(PTHREAD_WIN32_DIR_DLL):$(SSL_DIR)
+AUTOTEST_PATH = utilities:vswitchd:ovsdb:vtep:tests:ipsec:$(PTHREAD_WIN32_DIR_DLL):$(SSL_DIR)
 
 check-local:
set $(SHELL) '$(TESTSUITE)' -C tests AUTOTEST_PATH=$(AUTOTEST_PATH); \
diff --git a/tests/system-ipsec.at b/tests/system-ipsec.at
new file mode 100644
index ..2cd0469f5d33
--- /dev/null
+++ b/tests/system-ipsec.at
@@ -0,0 +1,406 @@
+AT_BANNER(IPsec)
+
+dnl IPSEC_SETUP_UNDERLAY()
+dnl
+dnl Configure anything required in the underlay network
+m4_define([IPSEC_SETUP_UNDERLAY],
+  [AT_CHECK([cp ${abs_top_srcdir}/vswitchd/vswitch.ovsschema vswitch.ovsschema])
+  dnl Set up the underlay switch
+  AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])])
+
+dnl IPSEC_ADD_NODE([namespace], [device], [address], [peer address])
+dnl
+dnl Creates a dummy host that acts as an IPsec endpoint. Creates host in
+dnl 'namespace' and attaches a veth 'device' to 'namespace' to act as the host
+dnl NIC. Assigns 'address' to 'device' and adds the other end of veth 'device' to
+dnl 'br0' which is an OVS bridge in the default namespace acting as an underlay
+dnl switch. Sets the default gateway of 'namespace' to 'peer address'.
+dnl
+dnl Starts all daemons in 'namespace' that are required for IPsec
+m4_define([IPSEC_ADD_NODE],
+  [ADD_NAMESPACES($1)
+  dnl Disable DAD. We know we won't get duplicates on this underlay network.
+  NS_EXEC([$1], [sysctl -w net.ipv6.conf.all.accept_dad=0])
+  NS_EXEC([$1], [sysctl -w net.ipv6.conf.default.accept_dad=0])
+  ADD_VETH($2, $1, br0, $3/24)
+  NS_EXEC([$1], [ip route add default via $4 dev $2])
+  mkdir -p $ovs_base/$1
+  touch $ovs_base/$1/.conf.db.~lock~
+  NS_EXEC([$1], [ovsdb-tool create $ovs_base/$1/conf.db \
+$abs_top_srcdir/vswitchd/vswitch.ovsschema], [0], [], [stderr])
+
+  dnl Start ovsdb-server.
+  NS_EXEC([$1],[ovsdb-server $ovs_base/$1/conf.db --detach --no-chdir \
+--log-file=$ovs_base/$1/ovsdb.log --pidfile=$ovs_base/$1/ovsdb.pid \
+--remote=punix:$OVS_RUNDIR/$1/db.sock], [0], [], [stderr])
+  on_exit "kill `cat $ovs_base/$1/ovsdb.pid`"
+  NS_EXEC([$1], [ovs-vsctl --no-wait init])
+
+  dnl Start ovs-vswitchd.
+  NS_EXEC([$1], [ovs-vswitchd unix:${OVS_RUNDIR}/$1/db.sock --detach \
+--no-chdir --pidfile=$ovs_base/$1/vswitchd.pid \
+--unixctl=$ovs_base/$1/vswitchd.ctl \
+--log-file=$ovs_base/$1/vswitchd.log -vvconn -vofproto_dpif -vunixctl],\
+[0], [], [stderr])
+  on_exit "kill_ovs_vswitchd `cat $ovs_base/$1/vswitchd.pid`"
+
+  dnl Start pluto
+  mkdir -p $ovs_base/$1/ipsec.d
+  touch $ovs_base/$1/ipsec.conf
+  touch $ovs_base/$1/secrets
+  ipsec initnss --nssdir $ovs_base/$1/ipsec.d
+  NS_CHECK_EXEC([$1], [ipsec pluto --config $ovs_base/$1/ipsec.conf \
+--ipsecdir $ovs_base/$1 --nssdir $ovs_base/$1/ipsec.d \
+--logfile $ovs_base/$1/pluto.log --secretsfile $ovs_base/$1/secrets \
+--rundir $ovs_base/$1], [0], [], [stderr])
+  on_exit "kill `cat $ovs_base/$1/pluto.pid`"
+
+  dnl Start ovs-monitor-ipse

[ovs-dev] [PATCH v3 3/5] ipsec: Allow custom file locations

2021-04-01 Thread Mark Gray
"ovs-monitor-ipsec" assumes fixed file locations for a number
of Libreswan objects. This patch makes these locations
configurable at startup in the Libreswan case.

This additional flexibility enables system testing for
OVS IPsec.
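
The fallback behaviour can be sketched as below.  This is an illustrative
reduction, not the script itself: the default paths and the args.* attribute
names come from this patch, but the command-line flag spellings and the helper
names (build_parser, resolve_paths) are assumptions made for the example.

```python
import argparse

# Default locations, as used by the patch when no override is given.
DEFAULTS = {
    "ipsec_conf": "/etc/ipsec.conf",
    "ipsec_d": "/etc/ipsec.d",
    "ipsec_secrets": "/etc/ipsec.secrets",
    "ipsec_ctl": "/run/pluto/pluto.ctl",
}


def build_parser():
    """Build a parser with one optional path flag per Libreswan object.

    The flag spellings (--ipsec-conf, ...) are hypothetical.
    """
    parser = argparse.ArgumentParser()
    for name in DEFAULTS:
        parser.add_argument("--" + name.replace("_", "-"), metavar="PATH")
    return parser


def resolve_paths(args, root_prefix=""):
    """Pick each override if present, else the default, under root_prefix."""
    paths = {}
    for name, default in DEFAULTS.items():
        paths[name] = root_prefix + (getattr(args, name, None) or default)
    # The NSS database location is passed with an "sql:" prefix.
    paths["ipsec_d"] = "sql:" + paths["ipsec_d"]
    return paths
```

A test harness can then point each emulated node at its own private
directory while a production deployment keeps the defaults.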

Signed-off-by: Mark Gray 
Acked-by: Flavio Leitner 
Acked-by: Aaron Conole 
Acked-by: Eelco Chaudron 
---
v2: removed unneeded '+' operator, moved libreswan arg parsing
 ipsec/ovs-monitor-ipsec.in | 103 -
 1 file changed, 80 insertions(+), 23 deletions(-)

diff --git a/ipsec/ovs-monitor-ipsec.in b/ipsec/ovs-monitor-ipsec.in
index 668507fd37dd..a9542477577d 100755
--- a/ipsec/ovs-monitor-ipsec.in
+++ b/ipsec/ovs-monitor-ipsec.in
@@ -445,12 +445,26 @@ conn prevent_unencrypted_vxlan
 CERT_PREFIX = "ovs_cert_"
 CERTKEY_PREFIX = "ovs_certkey_"
 
-def __init__(self, libreswan_root_prefix):
+def __init__(self, libreswan_root_prefix, args):
+ipsec_conf = args.ipsec_conf if args.ipsec_conf else "/etc/ipsec.conf"
+ipsec_d = args.ipsec_d if args.ipsec_d else "/etc/ipsec.d"
+ipsec_secrets = (args.ipsec_secrets if args.ipsec_secrets
+else "/etc/ipsec.secrets")
+ipsec_ctl = (args.ipsec_ctl if args.ipsec_ctl
+else "/run/pluto/pluto.ctl")
+
 self.IPSEC = libreswan_root_prefix + "/usr/sbin/ipsec"
-self.IPSEC_CONF = libreswan_root_prefix + "/etc/ipsec.conf"
-self.IPSEC_SECRETS = libreswan_root_prefix + "/etc/ipsec.secrets"
+self.IPSEC_CONF = libreswan_root_prefix + ipsec_conf
+self.IPSEC_SECRETS = libreswan_root_prefix + ipsec_secrets
+self.IPSEC_D = "sql:" + libreswan_root_prefix + ipsec_d
+self.IPSEC_CTL = libreswan_root_prefix + ipsec_ctl
 self.conf_file = None
 self.secrets_file = None
+vlog.dbg("Using: " + self.IPSEC)
+vlog.dbg("Configuration file: " + self.IPSEC_CONF)
+vlog.dbg("Secrets file: " + self.IPSEC_SECRETS)
+vlog.dbg("ipsec.d: " + self.IPSEC_D)
+vlog.dbg("Pluto socket: " + self.IPSEC_CTL)
 
 def restart_ike_daemon(self):
 """This function restarts LibreSwan."""
@@ -548,7 +562,8 @@ conn prevent_unencrypted_vxlan
 
 def refresh(self, monitor):
 vlog.info("Refreshing LibreSwan configuration")
-subprocess.call([self.IPSEC, "auto", "--rereadsecrets"])
+subprocess.call([self.IPSEC, "auto", "--ctlsocket", self.IPSEC_CTL,
+"--config", self.IPSEC_CONF, "--rereadsecrets"])
 tunnels = set(monitor.tunnels.keys())
 
 # Delete old connections
@@ -575,7 +590,9 @@ conn prevent_unencrypted_vxlan
 
 if not tunnel or tunnel.version != ver:
 vlog.info("%s is outdated %u" % (conn, ver))
-subprocess.call([self.IPSEC, "auto", "--delete", conn])
+subprocess.call([self.IPSEC, "auto", "--ctlsocket",
+self.IPSEC_CTL, "--config",
+self.IPSEC_CONF, "--delete", conn])
 elif ifname in tunnels:
 tunnels.remove(ifname)
 
@@ -595,22 +612,46 @@ conn prevent_unencrypted_vxlan
 # Update shunt policy if changed
 if monitor.conf_in_use["skb_mark"] != monitor.conf["skb_mark"]:
 if monitor.conf["skb_mark"]:
-subprocess.call([self.IPSEC, "auto", "--add",
+subprocess.call([self.IPSEC, "auto",
+"--config", self.IPSEC_CONF,
+"--ctlsocket", self.IPSEC_CTL,
+"--add",
 "--asynchronous", "prevent_unencrypted_gre"])
-subprocess.call([self.IPSEC, "auto", "--add",
+subprocess.call([self.IPSEC, "auto",
+"--config", self.IPSEC_CONF,
+"--ctlsocket", self.IPSEC_CTL,
+"--add",
 "--asynchronous", "prevent_unencrypted_geneve"])
-subprocess.call([self.IPSEC, "auto", "--add",
+subprocess.call([self.IPSEC, "auto",
+"--config", self.IPSEC_CONF,
+"--ctlsocket", self.IPSEC_CTL,
+"--add",
 "--asynchronous", "prevent_unencrypted_stt"])
-subprocess.call([self.IPSEC, "auto", "--add",
+subprocess.call([self.IPSEC, "auto",
+"--config", self.IPSEC_CONF,
+"--ctlsocket", self.IPSEC_CTL,
+"--add",
 "--asynchronous", "prevent_unencrypted_vxlan"])
 else:
-subprocess.call([self.IPSEC, "auto", "--delete",
+subprocess.call([self.IPSEC, "auto",
+"--config", self.IPSEC_CONF,
+
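
The fallback logic in this patch can be sketched in isolation. This is a minimal illustration, not the code from ovs-monitor-ipsec: the option names (--ipsec-conf, --ipsec-d, --ipsec-secrets, --ipsec-ctl) and the helper names are assumptions, because the argument-parsing hunk is not shown above. Only the default paths, the "sql:" prefix and the "--ctlsocket"/"--config" arguments passed to "ipsec auto" are taken from the diff.

```python
import argparse

# Default locations taken from the patch; used when an option is not given.
DEFAULTS = {
    "ipsec_conf": "/etc/ipsec.conf",
    "ipsec_d": "/etc/ipsec.d",
    "ipsec_secrets": "/etc/ipsec.secrets",
    "ipsec_ctl": "/run/pluto/pluto.ctl",
}


def build_parser():
    """Hypothetical parser: the real option names may differ."""
    parser = argparse.ArgumentParser()
    for name in DEFAULTS:
        # "--ipsec-conf" is stored by argparse as args.ipsec_conf, and so on.
        parser.add_argument("--" + name.replace("_", "-"),
                            metavar="PATH", default=None)
    return parser


def resolve_paths(args, root_prefix=""):
    """Apply defaults, then prepend the Libreswan root prefix."""
    paths = {name: root_prefix + (getattr(args, name) or default)
             for name, default in DEFAULTS.items()}
    # The patch prefixes the NSS directory with "sql:".
    paths["ipsec_d"] = "sql:" + paths["ipsec_d"]
    return paths


def ipsec_auto_cmd(ipsec_bin, paths, *action):
    """Build (but do not run) an 'ipsec auto' command line."""
    return [ipsec_bin, "auto",
            "--ctlsocket", paths["ipsec_ctl"],
            "--config", paths["ipsec_conf"],
            *action]


if __name__ == "__main__":
    args = build_parser().parse_args(["--ipsec-ctl", "/tmp/left/pluto.ctl"])
    paths = resolve_paths(args)
    print(ipsec_auto_cmd("/usr/sbin/ipsec", paths, "--rereadsecrets"))
```

Building the command list in one helper keeps the socket and configuration arguments consistent across every call site, which the patch otherwise repeats by hand.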

[ovs-dev] [PATCH v3 1/5] ipsec: IPv6 default route support for Libreswan

2021-04-01 Thread Mark Gray
When configuring IPsec, "ovs-monitor-ipsec" honours
the 'local_ip' option in the 'Interface' table by configuring
the 'left' side of the Libreswan connection with 'local_ip'.
If 'local_ip' is not specified, "ovs-monitor-ipsec" sets
'left' to '%defaultroute' which is interpreted as the IP
address of the default gateway interface.

However, when 'remote_ip' is an IPv6 address, Libreswan
still interprets '%defaultroute' as the IPv4 address on the
default gateway interface (see:
https://github.com/libreswan/libreswan/issues/416) giving
an "address family inconsistency" error.

This patch resolves this issue by specifying the
connection as IPv6 when the 'remote_ip' is IPv6 and
'local_ip' has not been set.

Fixes: 22c5eafb6efa ("ipsec: reintroduce IPsec support for tunneling")
Signed-off-by: Mark Gray 
Acked-by: Flavio Leitner 
Acked-by: Aaron Conole 
Acked-by: Eelco Chaudron 
---
v2: refactor address family parsing
v3: catch additional exception and add comment when getting
address family
 ipsec/ovs-monitor-ipsec.in | 37 +
 1 file changed, 37 insertions(+)

diff --git a/ipsec/ovs-monitor-ipsec.in b/ipsec/ovs-monitor-ipsec.in
index 64111768b33a..668507fd37dd 100755
--- a/ipsec/ovs-monitor-ipsec.in
+++ b/ipsec/ovs-monitor-ipsec.in
@@ -14,6 +14,7 @@
 # limitations under the License.
 
 import argparse
+import ipaddress
 import re
 import subprocess
 import sys
@@ -413,6 +414,11 @@ conn prevent_unencrypted_vxlan
 leftprotoport=udp/4789
 mark={0}
 
+"""
+
+IPV6_CONN = """\
+hostaddrfamily=ipv6
+clientaddrfamily=ipv6
 """
 
 auth_tmpl = {"psk": Template("""\
@@ -520,6 +526,9 @@ conn prevent_unencrypted_vxlan
 else:
 auth_section = self.auth_tmpl["pki_ca"].substitute(tunnel.conf)
 
+if tunnel.conf["address_family"] == "IPv6":
+auth_section = self.IPV6_CONN + auth_section
+
 vals = tunnel.conf.copy()
 vals["auth_section"] = auth_section
 vals["version"] = tunnel.version
@@ -756,6 +765,7 @@ class IPsecTunnel(object):
   Tunnel Type:$tunnel_type
   Local IP:   $local_ip
   Remote IP:  $remote_ip
+  Address Family: $address_family
   SKB mark:   $skb_mark
   Local cert: $certificate
   Local name: $local_name
@@ -797,6 +807,9 @@ class IPsecTunnel(object):
 "tunnel_type": row.type,
 "local_ip": options.get("local_ip", "%defaultroute"),
 "remote_ip": options.get("remote_ip"),
+"address_family": self._get_conn_address_family(
+   options.get("remote_ip"),
+   options.get("local_ip")),
 "skb_mark": monitor.conf["skb_mark"],
 "certificate": monitor.conf["pki"]["certificate"],
 "private_key": monitor.conf["pki"]["private_key"],
@@ -865,6 +878,17 @@ class IPsecTunnel(object):
 
 return header + conf + status + spds + sas + cons + "\n"
 
+def _get_conn_address_family(self, remote_ip, local_ip):
+remote = address_family(remote_ip)
+local = address_family(local_ip)
+
+if local is None:
+return remote
+elif local != remote:
+return None
+else:
+return remote
+
 def _is_valid_tunnel_conf(self):
 """This function verifies if IPsec tunnel has valid configuration
 set in 'conf'.  If it is valid, then it returns True.  Otherwise,
@@ -1120,6 +1144,19 @@ class IPsecMonitor(object):
 return m.group(1)
 
 
+def address_family(address):
+try:
+ip = ipaddress.ip_address(address)
+ipstr = str(type(ip))
+# ipaddress has inconsistencies with what exceptions are raised:
+# https://mail.openvswitch.org/pipermail/ovs-dev/2021-April/381696.html
+except (ValueError, ipaddress.AddressValueError):
+return None
+if ipstr.find('v6') != -1:
+return "IPv6"
+return "IPv4"
+
+
 def unixctl_xfrm_policies(conn, unused_argv, unused_aux):
 global xfrm
 policies = xfrm.get_policies()
-- 
2.27.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
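
The address-family selection in the patch above can be tried outside of ovs-monitor-ipsec. The following is a minimal standalone sketch under stated assumptions, not the patch itself: it reads the "version" attribute of the parsed address instead of searching the type name, it catches only ValueError, and "conn_address_family" is a free-function stand-in for the "_get_conn_address_family" method.

```python
import ipaddress


def address_family(address):
    """Return "IPv4", "IPv6", or None if 'address' cannot be parsed."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        # ipaddress.AddressValueError is a subclass of ValueError, so this
        # also covers it.  The patch lists both exceptions explicitly.
        return None
    return "IPv6" if ip.version == 6 else "IPv4"


def conn_address_family(remote_ip, local_ip):
    """Pick the family for the connection, or None on a mismatch."""
    remote = address_family(remote_ip)
    local = address_family(local_ip)

    if local is None:
        # No usable local_ip: follow the remote side.
        return remote
    if local != remote:
        # Local and remote families disagree.
        return None
    return remote


if __name__ == "__main__":
    print(conn_address_family("fd00::1", None))
    print(conn_address_family("10.0.0.1", "10.0.0.2"))
    print(conn_address_family("fd00::1", "10.0.0.2"))
```

With 'local_ip' unset and an IPv6 'remote_ip', the result is "IPv6", which is the case where the patch prepends the IPV6_CONN section.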


[ovs-dev] [PATCH v3 0/5] ipsec: Fix IPv6 support

2021-04-01 Thread Mark Gray
This series fixes IPv6 support for Libreswan and introduces
IPsec system tests for Libreswan.

Mark Gray (5):
  ipsec: IPv6 default route support for Libreswan
  system-common-macros: clean up veth device on test failure
  ipsec: Allow custom file locations
  ipsec: Introduce IPsec system tests for Libreswan
  ipsec: Update ordering of imports

 ipsec/ovs-monitor-ipsec.in | 144 ++--
 tests/automake.mk  |   3 +-
 tests/system-common-macros.at  |   2 +-
 tests/system-ipsec.at  | 406 +
 tests/system-kmod-testsuite.at |   1 +
 5 files changed, 529 insertions(+), 27 deletions(-)
 create mode 100644 tests/system-ipsec.at

-- 
2.27.0


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3 2/5] system-common-macros: clean up veth device on test failure

2021-04-01 Thread Mark Gray
'on_exit' should be run directly after creation
of veth device.

Fixes: 119db2cb18a7 ("kmod-macros: Move some code to traffic-common-macros.")
Signed-off-by: Mark Gray 
Acked-by: Eelco Chaudron 
Acked-by: Flavio Leitner 
Acked-by: Aaron Conole 
---
 tests/system-common-macros.at | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at
index 9d5e24a2922b..19a0b125b973 100644
--- a/tests/system-common-macros.at
+++ b/tests/system-common-macros.at
@@ -72,6 +72,7 @@ m4_define([ADD_INT],
 #
 m4_define([ADD_VETH],
 [ AT_CHECK([ip link add $1 type veth peer name ovs-$1 || return 77])
+  on_exit 'ip link del ovs-$1'
   CONFIGURE_VETH_OFFLOADS([$1])
   AT_CHECK([ip link set $1 netns $2])
   AT_CHECK([ip link set dev ovs-$1 up])
@@ -85,7 +86,6 @@ m4_define([ADD_VETH],
   if test -n "$6"; then
 NS_CHECK_EXEC([$2], [ip route add default via $6])
   fi
-  on_exit 'ip link del ovs-$1'
 ]
 )
 
-- 
2.27.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
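
The reason for moving 'on_exit' in the patch above can be illustrated with an analogy in Python. This is only a sketch of the ordering problem, not of the Autotest macros: the cleanup is registered with contextlib.ExitStack immediately after the resource is "created", so it still runs when a later configuration step fails. The function names and log strings are hypothetical.

```python
from contextlib import ExitStack


def add_veth(stack, log, fail_during_config):
    """Create a pretend veth device and register its cleanup at once."""
    log.append("create veth")
    # Register the cleanup right after creation, before anything can fail.
    stack.callback(log.append, "delete veth")
    if fail_during_config:
        raise RuntimeError("configuration step failed")
    log.append("configure veth")


def run(fail_during_config):
    """Run one pretend test and return the sequence of events."""
    log = []
    try:
        with ExitStack() as stack:
            add_veth(stack, log, fail_during_config)
    except RuntimeError:
        log.append("test failed")
    return log


if __name__ == "__main__":
    print(run(False))
    print(run(True))
```

If the callback were registered only at the end of add_veth(), the failing run would never record "delete veth", which mirrors the leaked veth device that the patch addresses.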


Re: [ovs-dev] [PATCH v2 4/5] ipsec: Introduce IPsec system tests for Libreswan

2021-04-01 Thread Mark Gray
On 31/03/2021 21:37, Flavio Leitner wrote:
> On Wed, Mar 31, 2021 at 04:05:08AM -0400, Mark Gray wrote:
>> This patch adds system tests for OVS IPsec using Libreswan.
>> If Libreswan is not present on the system, the tests will
>> be skipped.
>>
>> These tests set up an underlay switch with bridge 'br0'
>> to carry encrypted traffic between two emulated "nodes".
>> Each "node" is a separate network namespace ('left' and
>> 'right') and runs an instance of the Libreswan "pluto"
>> daemon, ovs-monitor-ipsec, ovs-vswitch and ovsdb-server.
>>
>> Each test sets up IPsec between the two emulated "nodes"
>> using various configurations (currently tunnel
>> type, IPv4/IPv6, authentication method, local_ip). After
>> configuration, connectivity between the two nodes is
>> tested and the underlay traffic is also inspected to
>> ensure the traffic is encrypted.
>>
>> All IPsec system tests can be run by using the ipsec
>> keyword:
>>
>> sudo make check-kernel TESTSUITEFLAGS='-k ipsec'
>>
>> Signed-off-by: Mark Gray 
>> ---
>> v2: removed sleep, addressed libreswan path length bug, move
>> geneve comment
>>
>>  tests/automake.mk  |   3 +-
>>  tests/system-ipsec.at  | 400 +
>>  tests/system-kmod-testsuite.at |   1 +
>>  3 files changed, 403 insertions(+), 1 deletion(-)
>>  create mode 100644 tests/system-ipsec.at
>>
>> diff --git a/tests/automake.mk b/tests/automake.mk
>> index 44a65849ccac..1a528aa394ff 100644
>> --- a/tests/automake.mk
>> +++ b/tests/automake.mk
>> @@ -173,6 +173,7 @@ SYSTEM_TESTSUITE_AT = \
>>  tests/system-common-macros.at \
>>  tests/system-layer3-tunnels.at \
>>  tests/system-traffic.at \
>> +tests/system-ipsec.at \
>>  tests/system-interface.at
>>  
>>  SYSTEM_OFFLOADS_TESTSUITE_AT = \
>> @@ -200,7 +201,7 @@ SYSTEM_DPDK_TESTSUITE = $(srcdir)/tests/system-dpdk-testsuite
>>  OVSDB_CLUSTER_TESTSUITE = $(srcdir)/tests/ovsdb-cluster-testsuite
>>  DISTCLEANFILES += tests/atconfig tests/atlocal
>>  
>> -AUTOTEST_PATH = utilities:vswitchd:ovsdb:vtep:tests:$(PTHREAD_WIN32_DIR_DLL):$(SSL_DIR)
>> +AUTOTEST_PATH = utilities:vswitchd:ovsdb:vtep:tests:ipsec:$(PTHREAD_WIN32_DIR_DLL):$(SSL_DIR)
>>  
>>  check-local:
>>  set $(SHELL) '$(TESTSUITE)' -C tests AUTOTEST_PATH=$(AUTOTEST_PATH); \
>> diff --git a/tests/system-ipsec.at b/tests/system-ipsec.at
>> new file mode 100644
>> index ..7dc0f9228f95
>> --- /dev/null
>> +++ b/tests/system-ipsec.at
>> @@ -0,0 +1,400 @@
>> +AT_BANNER(IPsec)
>> +
>> +dnl IPSEC_SETUP_UNDERLAY()
>> +dnl
>> +dnl Configure anything required in the underlay network
>> +m4_define([IPSEC_SETUP_UNDERLAY],
>> +  [AT_CHECK([cp ${abs_top_srcdir}/vswitchd/vswitch.ovsschema vswitch.ovsschema])
>> +  dnl Set up the underlay switch
>> +  AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])])
>> +
>> +dnl IPSEC_ADD_NODE([namespace], [device], [address], [peer address])
>> +dnl
>> +dnl Creates a dummy host that acts as an IPsec endpoint. Creates host in
>> +dnl 'namespace' and attaches a veth 'device' to 'namespace' to act as the host
>> +dnl NIC. Assigns 'address' to 'device' and adds the other end of veth 'device' to
>> +dnl 'br0' which is an OVS bridge in the default namespace acting as an underlay
>> +dnl switch. Sets the default gateway of 'namespace' to 'peer address'.
>> +dnl
>> +dnl Starts all daemons in 'namespace' that are required for IPsec
>> +m4_define([IPSEC_ADD_NODE],
>> +  [ADD_NAMESPACES($1)
>> +  dnl Disable DAD. We know we won't get duplicates on this underlay network.
>> +  NS_EXEC([$1], [sysctl -w net.ipv6.conf.all.accept_dad=0])
>> +  NS_EXEC([$1], [sysctl -w net.ipv6.conf.default.accept_dad=0])
>> +  ADD_VETH($2, $1, br0, $3/24)
>> +  NS_EXEC([$1], [ip route add default via $4 dev $2])
>> +  mkdir -p $ovs_base/$1
>> +  touch $ovs_base/$1/.conf.db.~lock~
>> +  NS_EXEC([$1], [ovsdb-tool create $ovs_base/$1/conf.db \
>> +$abs_top_srcdir/vswitchd/vswitch.ovsschema], [0], [], [stderr])
>> +
>> +  dnl Start ovsdb-server.
>> +  NS_EXEC([$1],[ovsdb-server $ovs_base/$1/conf.db --detach --no-chdir \
>> +--log-file=$ovs_base/$1/ovsdb.log --pidfile=$ovs_base/$1/ovsdb.pid \
>> +--remote=punix:$OVS_RUNDIR/$1/db.sock], [0], [], [stderr])
>> +  on_exit "kill `cat $ovs_base/$1/ovsdb.pid`"
>> +  NS_EXEC([$1], [ovs-vsctl --no-wait init])
>> +
>> +  dnl Start ovs-vswitchd.
>> +  NS_EXEC([$1], [ovs-vswitchd unix:${OVS_RUNDIR}/$1/db.sock --detach \
>> +--no-chdir --pidfile=$ovs_base/$1/vswitchd.pid \
>> +--unixctl=$ovs_base/$1/vswitchd.ctl \
>> +--log-file=$ovs_base/$1/vswitchd.log -vvconn -vofproto_dpif -vunixctl],\
>> +[0], [], [stderr])
>> +  on_exit "kill_ovs_vswitchd `cat $ovs_base/$1/vswitchd.pid`"
>> +
>> +  dnl Start pluto
>> +  mkdir -p $ovs_base/$1/ipsec.d
>> +  touch $ovs_base/$1/ipsec.conf
>> +  touch $ovs_base/$1/secrets
>> +  ipsec initnss --nssdir $ovs_base/$1/ipsec.d
>> +  NS_CHECK_EX

Re: [ovs-dev] [PATCH 3/4] ipsec: IPv6 default route support for Libreswan

2021-04-01 Thread Mark Gray
On 01/04/2021 09:26, Ilya Maximets wrote:
> On 3/30/21 6:15 PM, Mark Gray wrote:
>> On 30/03/2021 15:28, Aaron Conole wrote:
>>> Mark Gray  writes:
>>>
>>>> When configuring IPsec, "ovs-monitor-ipsec" honours
>>>> the 'local_ip' option in the 'Interface' table by configuring
>>>> the 'left' side of the Libreswan connection with 'local_ip'.
>>>> If 'local_ip' is not specified, "ovs-monitor-ipsec" sets
>>>> 'left' to '%defaultroute' which is interpreted as the IP
>>>> address of the default gateway interface.
>>>>
>>>> However, when 'remote_ip' is an IPv6 address, Libreswan
>>>> still interprets '%defaultroute' as the IPv4 address on the
>>>> default gateway interface (see:
>>>> https://github.com/libreswan/libreswan/issues/416) giving
>>>> an "address family inconsistency" error.
>>>>
>>>> This patch resolves this issue by specifying the
>>>> connection as IPv6 when the 'remote_ip' is IPv6 and
>>>> 'local_ip' has not been set.
>>>>
>>>> Signed-off-by: Mark Gray 
>>>> ---
>>>>  ipsec/ovs-monitor-ipsec.in | 54 +-
>>>>  1 file changed, 53 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/ipsec/ovs-monitor-ipsec.in b/ipsec/ovs-monitor-ipsec.in
>>>> index 9f412aaaf25a..b8cfb0a8ae79 100755
>>>> --- a/ipsec/ovs-monitor-ipsec.in
>>>> +++ b/ipsec/ovs-monitor-ipsec.in
>>>> @@ -14,10 +14,11 @@
>>>>  # limitations under the License.
>>>>  
>>>>  import argparse
>>>> +import copy
>>>
>>> I think it's okay to get things in alphabetical order, but it's not
>>> related.
>>
>> I'll change back
>>
>>>
>>>> +import ipaddress
>>>>  import re
>>>>  import subprocess
>>>>  import sys
>>>> -import copy
>>>>  import os
>>>>  from string import Template
>>>>  
>>>> @@ -413,6 +414,11 @@ conn prevent_unencrypted_vxlan
>>>>  leftprotoport=udp/4789
>>>>  mark={0}
>>>>  
>>>> +"""
>>>> +
>>>> +IPV6_CONN = """\
>>>> +hostaddrfamily=ipv6
>>>> +clientaddrfamily=ipv6
>>>>  """
>>>>  
>>>>  auth_tmpl = {"psk": Template("""\
>>>> @@ -528,6 +534,9 @@ conn prevent_unencrypted_vxlan
>>>>  else:
>>>>  auth_section = self.auth_tmpl["pki_ca"].substitute(tunnel.conf)
>>>>  
>>>> +if tunnel.conf["address_family"] == "IPv6":
>>>> +auth_section = self.IPV6_CONN + auth_section
>>>> +
>>>>  vals = tunnel.conf.copy()
>>>>  vals["auth_section"] = auth_section
>>>>  vals["version"] = tunnel.version
>>>> @@ -795,6 +804,7 @@ class IPsecTunnel(object):
>>>>    Tunnel Type:$tunnel_type
>>>>    Local IP:   $local_ip
>>>>    Remote IP:  $remote_ip
>>>> +  Address Family: $address_family
>>>>    SKB mark:   $skb_mark
>>>>    Local cert: $certificate
>>>>    Local name: $local_name
>>>> @@ -836,6 +846,9 @@ class IPsecTunnel(object):
>>>>  "tunnel_type": row.type,
>>>>  "local_ip": options.get("local_ip", "%defaultroute"),
>>>>  "remote_ip": options.get("remote_ip"),
>>>> +"address_family": self._get_conn_address_family(
>>>> +   options.get("remote_ip"),
>>>> +   options.get("local_ip")),
>>>>  "skb_mark": monitor.conf["skb_mark"],
>>>>  "certificate": monitor.conf["pki"]["certificate"],
>>>>  "private_key": monitor.conf["pki"]["private_key"],
>>>> @@ -904,6 +917,24 @@ class IPsecTunnel(object):
>>>>  
>>>>  return header + conf + status + spds + sas + cons + "\n"
>>>>  
>>>> +def _get_conn_address_family(self, remote_ip, local_ip):
>>>> +remote = address_family(remote_ip)
>>>> +local = address_family(local_ip)
>>>> +
>>>> +if local == "IPv4" and remote == "IPv4":
>>>> +return "IPv4"
>>>> +elif local == "IPv6" and remote == "IPv6":
>>>> +return "IPv6"
>>>> +elif remote == "IPv4" and local_ip is None:
>>>> +return "IPv4"
>>>> +elif remote == "IPv6" and local_ip is None:
>>>> +return "IPv6"
>>>> +elif remote != local:
>>>> +# remote family and local family are mismatched
>>>> +return None
>>>> +else:
>>>> +return None
>>>> +
>>>
>>> I think we can shrink this whole section to:
>>>
>>>
>>> def _get_conn_address_family(self, remote_ip, local_ip):
>>> remote = address_family(remote_ip)
>>> local = address_family(local_ip)
>>>
>>> if local is None:
>>>return remote
>>> elif local != remote:
>>>return None
>>>
>>> return remote
>>
>> Yes, you are right. Simpler.
>>>
>>>
>>>>  def _is_valid_tunnel_conf(self):
>>>>  """This function verifies if IPsec tunnel has valid configuration
>>>>  set in 'conf'.  If it is valid, then it returns True.  Otherwise,
>>>> @@ -1160,6 +1191,27 @@ class IPsecMonitor(object):
>>>>  
>>>>  return m.group(1)
>>>>  
>>>> +def is_ipv4(address):
>>>> +

Re: [ovs-dev] [PATCH v2 3/5] ipsec: IPv6 default route support for Libreswan

2021-04-01 Thread Mark Gray
On 01/04/2021 09:36, Ilya Maximets wrote:
> On 3/31/21 10:05 AM, Mark Gray wrote:
>> When configuring IPsec, "ovs-monitor-ipsec" honours
>> the 'local_ip' option in the 'Interface' table by configuring
>> the 'left' side of the Libreswan connection with 'local_ip'.
>> If 'local_ip' is not specified, "ovs-monitor-ipsec" sets
>> 'left' to '%defaultroute' which is interpreted as the IP
>> address of the default gateway interface.
>>
>> However, when 'remote_ip' is an IPv6 address, Libreswan
>> still interprets '%defaultroute' as the IPv4 address on the
>> default gateway interface (see:
>> https://github.com/libreswan/libreswan/issues/416) giving
>> an "address family inconsistency" error.
>>
>> This patch resolves this issue by specifying the
>> connection as IPv6 when the 'remote_ip' is IPv6 and
>> 'local_ip' has not been set.
>>
>> Signed-off-by: Mark Gray 
>> ---
>> v2: refactor address family parsing
>>  ipsec/ovs-monitor-ipsec.in | 35 +++
>>  1 file changed, 35 insertions(+)
> 
> Besides the comment I made on the previous version, this patch
> looks like a bug fix, unlike the others in the series.  Is there
> a reason why it is placed in the middle of the set?  Does it have
> any dependency on previous patches?
> 

No, I changed the order, which should make it easier for you to
backport. I also added a "Fixes" tag.

> Best regards, Ilya Maximets.
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovs v1] tunnel: Remove the padding from packet when encapsulating.

2021-04-01 Thread Tonghao Zhang
On Mon, Dec 14, 2020 at 11:11 AM  wrote:
>
> From: Tonghao Zhang 
>
> The root cause is that the old version of openvswitch doesn't
> remove the padding from a packet before L3+ conntrack processing,
> and then the packets are dropped in the Linux kernel stack. The patch [1]
> fixes the issue. We fix this issue on a gateway which is running ovs-dpdk
> as a quick workaround. Padding should be removed because tunnel size
> + inner size > 64B. For more details, see [1].
>
> [1] - 
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=9382fe71c0058465e942a633869629929102843d
> Signed-off-by: Tonghao Zhang 
ping :)
> ---
>  lib/netdev-native-tnl.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
> index b89dfdd52..acfbb13c4 100644
> --- a/lib/netdev-native-tnl.c
> +++ b/lib/netdev-native-tnl.c
> @@ -149,11 +149,15 @@ void *
>  netdev_tnl_push_ip_header(struct dp_packet *packet,
> const void *header, int size, int *ip_tot_size)
>  {
> +int padding = dp_packet_l2_pad_size(packet);
>  struct eth_header *eth;
>  struct ip_header *ip;
>  struct ovs_16aligned_ip6_hdr *ip6;
>
>  eth = dp_packet_push_uninit(packet, size);
> +if (padding) {
> +dp_packet_set_size(packet, dp_packet_size(packet) - padding);
> +}
>  *ip_tot_size = dp_packet_size(packet) - sizeof (struct eth_header);
>
>  memcpy(eth, header, size);
> --
> 2.14.1
>


-- 
Best regards, Tonghao
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
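
The size arithmetic behind the padding patch quoted above can be sketched with plain byte strings. This is an illustration only, not OVS code: the function names are hypothetical, the 60-byte minimum is the usual Ethernet frame size without the FCS, and the 50-byte tunnel header (outer Ethernet, IPv4, UDP and VXLAN) is an assumed example.

```python
ETH_HEADER_LEN = 14
ETH_MIN_FRAME_LEN = 60  # minimum Ethernet frame size, FCS not included


def pad_frame(frame):
    """Pad a short frame with zero bytes; return (frame, pad_size)."""
    pad = max(0, ETH_MIN_FRAME_LEN - len(frame))
    return frame + bytes(pad), pad


def push_tunnel_header(frame, l2_pad_size, header):
    """Prepend 'header' and drop the trailing L2 padding of the inner frame.

    Returns (packet, ip_tot_size), where ip_tot_size is the outer IP total
    length, i.e. everything after the outer Ethernet header.
    """
    if l2_pad_size:
        # After encapsulation the packet exceeds the minimum size anyway,
        # so the inner padding is no longer needed.
        frame = frame[:len(frame) - l2_pad_size]
    packet = header + frame
    return packet, len(packet) - ETH_HEADER_LEN


if __name__ == "__main__":
    inner = bytes(ETH_HEADER_LEN + 28)      # Ethernet + 28-byte IP packet
    padded, pad = pad_frame(inner)          # 42 bytes padded up to 60
    tunnel_header = bytes(50)               # assumed outer header size
    print(push_tunnel_header(padded, pad, tunnel_header)[1])
    print(push_tunnel_header(padded, 0, tunnel_header)[1])
```

In the second call the padding is kept, so the outer IP length also counts 18 bytes that are not part of the inner packet.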


Re: [ovs-dev] [PATCH v4 0/3] ovsdb-idl: Preserve references for tracked deleted rows.

2021-04-01 Thread Ilya Maximets
On 3/28/21 6:23 AM, Han Zhou wrote:
> For the series:
> Acked-by: Han Zhou <hz...@ovn.org>
> 
> On Wed, Mar 24, 2021 at 2:33 AM Dumitru Ceara wrote:
> 
> Patch 1/3 of the series makes the ovsdb-idl tests more future proof
> by trying to ensure more predictable output from test-ovsdb.
> 
> Patches 2/3 and 3/3 fix problems in the IDL change tracking code.
> 
> Changes in v4:
> - Patch 1/3:
>   - Rebase.
>   - Readd UUID to test-ovsdb.py output.
>   - Fix indentation in test-ovsdb.c.
> - Patch 2/3:
>   - Rename orphan_rows to deleted_untracked_rows.
>   - Rename ovsdb_idl_process_orphans() to ovsdb_idl_reparse_deleted().
>   - Revert changes to ovsdb_idl_row_reparse_backrefs().
>   - Unified test-ovsdb.c and test-ovsdb.py output for simple3's uset and
>     uref columns.
>   - Added two more tests for deletion of strong references due to monitor
>     condition change.
> - Patch 3/3:
>   - Rebase.
> 
> Changes in v3:
> - Patch 1/3:
>   - Changed expected output of ovsdb-cluster.at to reflect the new
>     formatting in test-ovsdb output.
>   - Fixed typo in test-ovsdb.py.
> - Patch 2/3:
>   - Rework based on the discussion with Ilya.
>   - Added more tests.
> - Add patch 3/3:
>   - Mark reference sources as "updated" when destinations are deleted.
> 
> Changes in v2:
> - Patch 1/2:
>   - reworked the patch to improve the output of test-ovsdb.c and
>     test-ovsdb.py themselves.
> - Patch 2/2:
>   - added a test for strong references.
> 
> Dumitru Ceara (3):
>       ovsdb-idl.at: Make test outputs more predictable.
>       ovsdb-idl: Preserve references for deleted rows.
>       ovsdb-idl: Mark arc sources as updated when destination is deleted.
> 
> 
>  lib/ovsdb-idl.c        |  137 +++--
>  lib/ovsdb-idl.h        |    2
>  tests/ovsdb-cluster.at  |    2
>  tests/ovsdb-idl.at      |  747
>  tests/test-ovsdb.c     |  246 +++-
>  tests/test-ovsdb.py    |  119 +---
>  6 files changed, 861 insertions(+), 392 deletions(-)
> 

Thanks, Dumitru and Han!
I applied this series to master and branch-2.15.

It looks like we should have these fixes on LTS, but patches
are not directly applicable.  Dumitru, could you prepare backports?

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] conntrack: handle SNAT with all-zero IP address

2021-04-01 Thread 0-day Robot
Bleep bloop.  Greetings Paolo Valerio, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


git-am:
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch' to see the failed patch
Patch failed at 0001 conntrack: handle SNAT with all-zero IP address
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] expr: Combine multiple ipv4 with wildcard mask.(Internet mail)

2021-04-01 Thread 陈供明
Correct some words


On 2021/4/1, 8:17 PM, "gmingchen(陈供明)"  wrote:



On 2021/3/30, 4:59 PM, "Mark Gray"  wrote:

On 29/03/2021 13:30, gmingchen(陈供明) wrote:
> 
> On 2021/3/25, 11:30 PM, "Mark Gray"  wrote:
> 
> On 19/03/2021 13:01, Dumitru Ceara wrote:
> > On 3/10/21 2:29 PM, gmingchen(陈供明) wrote:
> >> From: Gongming Chen 
> >>
> 
> Thanks for the patch. Looks like a lot of thought went into the
> algorithm and it has been interesting to review.
> 
> Do you know if there are any well-known algorithms to do this? It 
seems
> like a common problem? If there was, it may be better to use it 
as we
> could reference standard documentation.
> 
> Hi Mark,
> First of all, thanks for the review.
> 
> Unfortunately, I did not find a well-known algorithms.

Ok. Thats a shame. It's kind of like IP subnetting so I thought there
may be something.

> 
> >> This patch merges ipv4 addresses with wildcard masks, and 
replaces this
> >> ipv4 addresses with the combined ip/mask. This will greatly 
reduce the
> >> entries in the ovs security group flow table, especially when 
the host
> >> size is large.
> >>
> >> Analysis in the simplest scenario, a network 1.1.1.0/24 
network,create 253
> >> ports(1.1.1.2-1.1.1.254).
> >> Only focus on the number of ip addresses, the original 253 
addresses will
> >> be combined into 13 addresses.
> >> 1.1.1.2/31
> >> 1.1.1.4/30
> >> 1.1.1.8/29
> >> 1.1.1.16/28
> >> 1.1.1.32/27
> >> 1.1.1.64/26
> >> 1.1.1.128/26
> >> 1.1.1.192/27
> >> 1.1.1.224/28
> >> 1.1.1.240/29
> >> 1.1.1.248/30
> >> 1.1.1.252/31
> >> 1.1.1.254
> >>
> >> Some scenes are similar to the following:
> >> 1.1.1.2, 1.1.1.6
> >> After the combine:
> >> 1.1.1.2/255.255.255.251
> >> You can use ovn-match-ip utility to match ip.
> >> such as:
> >> ovs-ofctl dump-flows br-int | ovn-match-ip 1.1.1.6
> >> 1.1.1.2/255.255.255.251 will show.
> >>
> >> Simple description of the algorithm.
> >> There are two situations
> >> 1. Combine once
> >> such as:
> >> 1.1.1.0 1.1.1.1 1.0.1.0 1.0.1.1
> >> Combined into: 1.1.1.0/31, 1.0.1.0/31
> >> 2. Combine multiple times
> >> 1.1.1.0 1.1.1.1 1.0.1.0 1.0.1.1
> >> Combined into: 1.0.1.0/255.254.255.254
> >>
> >> Considering the actual scene and simplicity, the first case is 
used to
> >> combine once.
> >>
> >> ...00...
> >> ...01...
> >> ...10...
> >> ...11...
> >> "..." means the same value omitted.
> >> Obviously, the above value can be expressed as 
...00.../11100111. This
> >> continuous interval that can be represented by one or several 
wildcard
> >> masks is called a segment.
> >> Only if all 2< >> exist, can they be combined into 00...(n)/00...( n)
> >>
> >> First sort all the values by size. Iterate through each value.
> >> 1. Find a new segment, where two values differ only by 1 bit, 
such as
> >> ...0... and ...1...
> >> diff = ip_next ^ ip
> >> if (diff & (diff-1)) == 0
> >> new_segment = true
> >> The first non-zero place in the high direction of ip is the 
end of the
> >> segment(segment_end).
> >> For example...100... and...101..., the segment_end is ...111...
> >>
> >> 2. Count the number of consecutive and less than 
continuous_size in the
> >> segment.
> >> diff = ip_next - ip
> >> if (diff & (diff-1)) == 0 && ip_next <= segment_end
> >> continuous_size++
> >>
> >> 3. Combine different ip intervals in the segment according to
> >> continuous_size.
> >> In continuous_size, from the highest bit of 1 to the lowest 
bit of 1, in
> >> the order of segment start, each bit that is 1 is a different 
ip interval
> >> that can be combined with a wildcard mask.
> >> For example, 000, 001, 010:
> >> continuous_size: 3 (binary 11), segment_start: 000
> >> mask: ~(1 << 1 - 1) = 110; ~(1 << 0 - 1) = 111;
> >> Combined to: 000/110, 010/111
> 

Re: [ovs-dev] [PATCH ovn] expr: Combine multiple ipv4 with wildcard mask.(Internet mail)

2021-04-01 Thread 陈供明


On 2021/3/30, 4:59 PM, "Mark Gray"  wrote:

On 29/03/2021 13:30, gmingchen(陈供明) wrote:
> 
> On 2021/3/25, 11:30 PM, "Mark Gray"  wrote:
> 
> On 19/03/2021 13:01, Dumitru Ceara wrote:
> > On 3/10/21 2:29 PM, gmingchen(陈供明) wrote:
> >> From: Gongming Chen 
> >>
> 
> Thanks for the patch. Looks like a lot of thought went into the
> algorithm and it has been interesting to review.
> 
> Do you know if there are any well-known algorithms to do this? It 
seems
> like a common problem? If there was, it may be better to use it as we
> could reference standard documentation.
> 
> Hi Mark,
> First of all, thanks for the review.
> 
> Unfortunately, I did not find a well-known algorithms.

Ok. Thats a shame. It's kind of like IP subnetting so I thought there
may be something.

> 
> >> This patch merges ipv4 addresses with wildcard masks, and replaces 
this
> >> ipv4 addresses with the combined ip/mask. This will greatly reduce 
the
> >> entries in the ovs security group flow table, especially when the 
host
> >> size is large.
> >>
> >> Analysis in the simplest scenario, a network 1.1.1.0/24 
network,create 253
> >> ports(1.1.1.2-1.1.1.254).
> >> Only focus on the number of ip addresses, the original 253 
addresses will
> >> be combined into 13 addresses.
> >> 1.1.1.2/31
> >> 1.1.1.4/30
> >> 1.1.1.8/29
> >> 1.1.1.16/28
> >> 1.1.1.32/27
> >> 1.1.1.64/26
> >> 1.1.1.128/26
> >> 1.1.1.192/27
> >> 1.1.1.224/28
> >> 1.1.1.240/29
> >> 1.1.1.248/30
> >> 1.1.1.252/31
> >> 1.1.1.254
> >>
> >> Some scenes are similar to the following:
> >> 1.1.1.2, 1.1.1.6
> >> After the combine:
> >> 1.1.1.2/255.255.255.251
> >> You can use ovn-match-ip utility to match ip.
> >> such as:
> >> ovs-ofctl dump-flows br-int | ovn-match-ip 1.1.1.6
> >> 1.1.1.2/255.255.255.251 will show.
> >>
> >> Simple description of the algorithm.
> >> There are two situations
> >> 1. Combine once
> >> such as:
> >> 1.1.1.0 1.1.1.1 1.0.1.0 1.0.1.1
> >> Combined into: 1.1.1.0/31, 1.0.1.0/31
> >> 2. Combine multiple times
> >> 1.1.1.0 1.1.1.1 1.0.1.0 1.0.1.1
> >> Combined into: 1.0.1.0/255.254.255.254
> >>
> >> Considering the actual scene and simplicity, the first case is 
used to
> >> combine once.
> >>
> >> ...00...
> >> ...01...
> >> ...10...
> >> ...11...
> >> "..." means the same value omitted.
> >> Obviously, the above value can be expressed as ...00.../11100111. 
This
> >> continuous interval that can be represented by one or several 
wildcard
> >> masks is called a segment.
> >> Only if all 2< >> exist, can they be combined into 00...(n)/00...( n)
> >>
> >> First sort all the values by size. Iterate through each value.
> >> 1. Find a new segment, where two values differ only by 1 bit, such 
as
> >> ...0... and ...1...
> >> diff = ip_next ^ ip
> >> if (diff & (diff-1)) == 0
> >> new_segment = true
> >> The first non-zero place in the high direction of ip is the end of 
the
> >> segment(segment_end).
> >> For example...100... and...101..., the segment_end is ...111...
> >>
> >> 2. Count the number of consecutive and less than continuous_size 
in the
> >> segment.
> >> diff = ip_next - ip
> >> if (diff & (diff-1)) == 0 && ip_next <= segment_end
> >> continuous_size++
> >>
> >> 3. Combine different ip intervals in the segment according to
> >> continuous_size.
> >> In continuous_size, from the highest bit of 1 to the lowest bit of 
1, in
> >> the order of segment start, each bit that is 1 is a different ip 
interval
> >> that can be combined with a wildcard mask.
> >> For example, 000, 001, 010:
> >> continuous_size: 3 (binary 11), segment_start: 000
> >> mask: ~(1 << 1 - 1) = 110; ~(1 << 0 - 1) = 111;
> >> Combined to: 000/110, 010/111
> >>
> >> 4. The ip that cannot be recorded in a segment will not be combined.
> >>
> >> Signed-off-by: Gongming Chen 
> >> ---
> > 
> > Hi Gongming,
> > 
> > Sorry for the delayed review.
> > 
> > I have a few general remarks/concerns and some specific comments inline.
> > First, the general remarks.
> > 
> > I'm wondering if it would make more sense for this wildcard combination
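
For readers following the algorithm quoted above, here is a minimal sketch of
its two bit tricks: the "differ by exactly one bit" test from step 1 and the
mask computation from step 3. It is not code from the patch; the names
differ_by_one_bit, split_run and struct masked_ip are hypothetical, and
split_run assumes the run starts at a suitably aligned segment start. Unlike
the quoted pseudocode, the one-bit test also rejects diff == 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if a and b differ in exactly one bit
 * (step 1 of the quoted description). */
static int differ_by_one_bit(uint32_t a, uint32_t b)
{
    uint32_t diff = a ^ b;
    return diff != 0 && (diff & (diff - 1)) == 0;
}

/* Hypothetical result type: an address plus a wildcard mask. */
struct masked_ip {
    uint32_t ip;
    uint32_t mask;
};

/* Splits the run [start, start + size) into power-of-two blocks by walking
 * the set bits of size from highest to lowest (step 3).  Assumes start is
 * aligned to the largest block.  Returns the number of entries written,
 * which is at most 32. */
static size_t split_run(uint32_t start, uint32_t size,
                        struct masked_ip out[32])
{
    size_t n = 0;

    for (int bit = 31; bit >= 0; bit--) {
        uint32_t block = UINT32_C(1) << bit;

        if (size & block) {
            out[n].ip = start;
            out[n].mask = ~(block - 1);   /* ~((1 << bit) - 1) */
            n++;
            start += block;
        }
    }
    return n;
}
```

With start 000 and continuous_size 3, this yields 000 with the low bit
wildcarded and 010 as an exact match, matching the "000/110, 010/111" example
when only the low three bits are considered.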

Re: [ovs-dev] [PATCH RESEND ovs v3 4/4] dpif: Don't set "burst_size" to "rate" if not specified.

2021-04-01 Thread Tonghao Zhang
On Wed, Mar 31, 2021 at 10:26 AM Tonghao Zhang  wrote:
>
> On Wed, Mar 31, 2021 at 6:29 AM Jean Tourrilhes wrote:
> >
> > On Tue, Mar 30, 2021 at 02:27:11PM -0700, Ben Pfaff wrote:
> > > On Tue, Mar 30, 2021 at 11:16:48PM +0200, Ilya Maximets wrote:
> > > >
> > > > OpenFlow spec is a bit loose in definition of what should
> > > > be behavior if burst is not set:
> > > > """
> > > > If the flag OFPMF_BURST is not set the burst_size values from meter
> > > > bands are ignored, and if the meter implementation uses a burst value,
> > > > this burst value must be set to an implementation defined optimal value.
> > > > """
> > > >
> > > > In our case, historically, "implementation defined optimal value" was
> > > > value equal to rate.  I have no idea why, but it's hard to argue with
> > > > it since the spec gives a great freedom to choose.
> > > >
> > > > Actually, the "burst" itself as a term makes very little sense to me.
> > > > It's defined by the spec as:
> > > > """
> > > > It defines the granularity of the meter band, for all packet or byte
> > > > bursts whose length is greater than burst value, the meter rate will
> > > > always be strictly enforced.
> > > > """
> > > >
> > > > But what is the burst?  How the implementation should define which
> > > > packets are in the burst and which are from the next one?
> > > >
> > > > Current implementation just assumes that bursts are measured per second.
> > > > But the rate is measured per second too.  So, burst and rate is
> > > > essentially the same thing and implementations just sums them together
> > > > to get the bucket size.  So, I do not understand why "burst" and
> > > > "burst_size" exist at all.  Why not just set the rate a bit higher?
> > > >
> > > > Ben, can you shed some light on this?  What was the original idea
> > > > behind the meter burst?  Or maybe I'm missing something?
> >
> > I don't understand how you can confuse a rate and a size. The
> > OpenFlow spec clearly says it's in kilobits or packets (not per
> > second).
> > A basic token bucket has only two parameters, the committed
> > rate and the burst size (i.e. maximum number of tokens in the
> > bucket). The spec reflects that in a generic way to avoid mandating an
> > implementation.
> > Burst rate is only defined for more fancy rate limiters, such
> > as two-color rate limiters. In this case, you also have two burst
> > sizes, one for each token bucket. The OpenFlow spec does not support
> > those extra parameters (as of version 1.5.1).
> > For Linux 'police' filter : rate == rate ; burst_size == burst
> > For Linux 'htb' qdisc : rate == rate ; burst_size == burst ;
> Yes, we also do this in the kernel datapath, but not in the userspace datapath.
> > ceil and cburst are not supported.
> >
> > > I wasn't really involved in the design of meters.  I saw them as a
> > > feature of hardware switches that was not very relevant to software
> > > switches.  I guess I was wrong.
> > >
> > > I think that Jean Tourrilhes was the primary architect of meters in
> > > OpenFlow.  I am adding him to this thread.  I hope that he can help.
> >
> > Have fun...
> >
> > Jean
Hi Ben, Ilya,
Let me try to explain this patch again. OVS now supports burst_size.
As one use case, if users don't use the burst_size feature, we should set
burst_size to either rate or 0. This patch sets it to 0.

As Ilya said, we should check OFPMF13_BURST in the userspace datapath;
I think that's right.

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 251788b04965..afd698be1a59 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -6349,13 +6349,12 @@ dpif_netdev_meter_set(struct dpif *dpif, ofproto_meter_id meter_id,
 for (i = 0; i < config->n_bands; ++i) {
 uint32_t band_max_delta_t;

-/* Set burst size to a workable value if none specified. */
-if (config->bands[i].burst_size == 0) {
-config->bands[i].burst_size = config->bands[i].rate;
+/* Set burst size to a workable value if specified. */
+if (config->flags & OFPMF13_BURST) {
+meter->bands[i].burst_size = config->bands[i].burst_size;
 }

 meter->bands[i].rate = config->bands[i].rate;
-meter->bands[i].burst_size = config->bands[i].burst_size;
 /* Start with a full bucket. */
 meter->bands[i].bucket =
 (meter->bands[i].burst_size + meter->bands[i].rate) * 1000ULL;
diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index ceb56c6851c6..f3db0c6802b9 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -3761,7 +3761,7 @@ dpif_netlink_meter_set__(struct dpif *dpif_, ofproto_meter_id meter_id,
 nl_msg_put_u32(&buf, OVS_BAND_ATTR_RATE, band->rate);
 nl_msg_put_u32(&buf, OVS_BAND_ATTR_BURST,
config->flags & OFPMF13_BURST ?
-   band->burst_size : band->rate);
+   band->burst_size : 0);
 nl_msg_end_nested(&buf, b
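
To make the rate versus burst size distinction discussed in this thread
concrete, here is a minimal sketch of the basic two-parameter token bucket
described above: the rate refills the bucket over time and the burst size
caps how many tokens it can hold. This is a generic illustration, not the
dpif-netdev or kernel meter code; struct token_bucket, tb_init and tb_consume
are hypothetical names, tokens are kept in thousandths so that millisecond
refills stay exact, and overflow of very large rates is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical token bucket: 'rate' tokens are added per second and the
 * bucket never holds more than 'burst' tokens. */
struct token_bucket {
    uint64_t rate;          /* Tokens per second. */
    uint64_t burst;         /* Maximum number of tokens in the bucket. */
    uint64_t tokens_milli;  /* Current fill level, in 1/1000 of a token. */
    uint64_t last_ms;       /* Time of the last refill, in milliseconds. */
};

/* Starts with a full bucket. */
static void tb_init(struct token_bucket *tb, uint64_t rate, uint64_t burst,
                    uint64_t now_ms)
{
    tb->rate = rate;
    tb->burst = burst;
    tb->tokens_milli = burst * 1000;
    tb->last_ms = now_ms;
}

/* Refills the bucket for the elapsed time, then tries to take 'cost'
 * tokens.  Returns true if the packet conforms, false if it exceeds the
 * configured rate and burst. */
static bool tb_consume(struct token_bucket *tb, uint64_t cost,
                       uint64_t now_ms)
{
    uint64_t elapsed_ms = now_ms - tb->last_ms;
    uint64_t cap = tb->burst * 1000;
    uint64_t need = cost * 1000;

    tb->last_ms = now_ms;
    tb->tokens_milli += tb->rate * elapsed_ms;  /* rate/s == milli-tokens/ms */
    if (tb->tokens_milli > cap) {
        tb->tokens_milli = cap;
    }
    if (tb->tokens_milli < need) {
        return false;
    }
    tb->tokens_milli -= need;
    return true;
}
```

In this sketch a burst of 0 means no packet ever conforms, which is one reason
an implementation needs some default when the burst flag is not set.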

Re: [ovs-dev] [PATCH ovn] northd: Restore flows that recirculate packets in the router DNAT zone.

2021-04-01 Thread Dumitru Ceara
On 4/1/21 12:23 PM, Numan Siddique wrote:
> On Thu, Apr 1, 2021 at 3:22 PM Lorenzo Bianconi
>  wrote:
>>
>>>
>>> Also improve the tests to make sure this doesn't break again in the
>>> future.
>>>
>>> Fixes: 225426081f85 ("northd: introduce build_lrouter_in_dnat_flow routine")
>>> CC: Lorenzo Bianconi 
>>> Reported-by: Ilya Maximets 
>>> Signed-off-by: Dumitru Ceara 
>>> ---
>>> Note: the regression was only in the ovn-northd C implementation, there
>>> are no ovn-northd-ddlog changes required.
>>> ---
>>>  northd/ovn-northd.c | 26 --
>>>  tests/ovn-northd.at | 40 +++-
>>>  2 files changed, 39 insertions(+), 27 deletions(-)
>>
>> Thx for fixing this:
>>
>> Acked-by: Lorenzo Bianconi 
> 
> Thanks Ilya and Dumitru for reporting and fixing this.
> 
> I applied this patch to master.
> 
> Numan
> 

Thanks!

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3] conntrack: handle SNAT with all-zero IP address

2021-04-01 Thread Paolo Valerio
This patch introduces, for the userspace datapath, the handling
of rules like the following:

ct(commit,nat(src=0.0.0.0),...)

The kernel datapath already handles this case, which is particularly
handy in scenarios like the following:

Given A: 10.1.1.1, B: 192.168.2.100, C: 10.1.1.2

A opens a connection toward B on port 80 selecting as source port 1.
B's IP gets dnat'ed to C's IP (10.1.1.1:1 -> 192.168.2.100:80).

This will result in:

tcp,orig=(src=10.1.1.1,dst=192.168.2.100,sport=1,dport=80),reply=(src=10.1.1.2,dst=10.1.1.1,sport=80,dport=1),protoinfo=(state=ESTABLISHED)

A now tries to establish another connection with C using source port
1, this time using C's IP address (10.1.1.1:1 -> 10.1.1.2:80).

This second connection, if processed by conntrack with no SNAT/DNAT
involved, collides with the reverse tuple of the first connection,
so the entry for this valid connection doesn't get created.

With this commit, adding a SNAT rule with 0.0.0.0 for
10.1.1.1:1 -> 10.1.1.2:80 allows the conn entry to be created:

tcp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=1,dport=80),reply=(src=10.1.1.2,dst=10.1.1.1,sport=80,dport=10001),protoinfo=(state=ESTABLISHED)
tcp,orig=(src=10.1.1.1,dst=192.168.2.100,sport=1,dport=80),reply=(src=10.1.1.2,dst=10.1.1.1,sport=80,dport=1),protoinfo=(state=ESTABLISHED)

The issue exists even in the opposite case (with A trying to connect
to C using B's IP after establishing a direct connection from A to C).

This commit refactors the relevant function so that both of the
previously mentioned cases are handled.

Suggested-by: Eelco Chaudron 
Signed-off-by: Paolo Valerio 
---
v2: enable NULL SNAT self-test also for userspace.
v3: replace NULL with all-zero in the commit message

Note for the maintainers:
the patch depends on the following:

https://patchwork.ozlabs.org/project/openvswitch/patch/161710710690.181407.5749135681436588686.stgit@ebuild/

 lib/conntrack.c  |  338 --
 lib/conntrack.h  |   15 ++
 tests/system-userspace-macros.at |7 -
 3 files changed, 228 insertions(+), 132 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 99198a601..664542c2d 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -108,9 +108,8 @@ static void set_label(struct dp_packet *, struct conn *,
 static void *clean_thread_main(void *f_);
 
 static bool
-nat_select_range_tuple(struct conntrack *ct, const struct conn *conn,
-   struct conn *nat_conn);
-
+nat_get_unique_tuple(struct conntrack *ct, const struct conn *conn,
+ struct conn *nat_conn);
 static uint8_t
 reverse_icmp_type(uint8_t type);
 static uint8_t
@@ -728,11 +727,11 @@ pat_packet(struct dp_packet *pkt, const struct conn *conn)
 }
 } else if (conn->nat_info->nat_action & NAT_ACTION_DST) {
 if (conn->key.nw_proto == IPPROTO_TCP) {
-struct tcp_header *th = dp_packet_l4(pkt);
-packet_set_tcp_port(pkt, th->tcp_src, conn->rev_key.src.port);
+packet_set_tcp_port(pkt, conn->rev_key.dst.port,
+conn->rev_key.src.port);
 } else if (conn->key.nw_proto == IPPROTO_UDP) {
-struct udp_header *uh = dp_packet_l4(pkt);
-packet_set_udp_port(pkt, uh->udp_src, conn->rev_key.src.port);
+packet_set_udp_port(pkt, conn->rev_key.dst.port,
+conn->rev_key.src.port);
 }
 }
 }
@@ -786,11 +785,9 @@ un_pat_packet(struct dp_packet *pkt, const struct conn *conn)
 }
 } else if (conn->nat_info->nat_action & NAT_ACTION_DST) {
 if (conn->key.nw_proto == IPPROTO_TCP) {
-struct tcp_header *th = dp_packet_l4(pkt);
-packet_set_tcp_port(pkt, conn->key.dst.port, th->tcp_dst);
+packet_set_tcp_port(pkt, conn->key.dst.port, conn->key.src.port);
 } else if (conn->key.nw_proto == IPPROTO_UDP) {
-struct udp_header *uh = dp_packet_l4(pkt);
-packet_set_udp_port(pkt, conn->key.dst.port, uh->udp_dst);
+packet_set_udp_port(pkt, conn->key.dst.port, conn->key.src.port);
 }
 }
 }
@@ -810,12 +807,10 @@ reverse_pat_packet(struct dp_packet *pkt, const struct conn *conn)
 }
 } else if (conn->nat_info->nat_action & NAT_ACTION_DST) {
 if (conn->key.nw_proto == IPPROTO_TCP) {
-struct tcp_header *th_in = dp_packet_l4(pkt);
-packet_set_tcp_port(pkt, th_in->tcp_src,
+packet_set_tcp_port(pkt, conn->key.src.port,
 conn->key.dst.port);
 } else if (conn->key.nw_proto == IPPROTO_UDP) {
-struct udp_header *uh_in = dp_packet_l4(pkt);
-packet_set_udp_port(pkt, uh_in->udp_src,
+packet_set_udp_port(pkt, conn->key.src.port,
 conn->key.dst.port);
 }
 }
@@ -1029,14 +1024,14 @@ conn_not_found(str
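
The collision described in the commit message above can be sketched with
plain tuples. This is only an illustration of why the second connection
clashes with the first one's reply tuple and why picking a different source
port avoids it; struct tuple, tuple_reverse, tuple_equal and the IP4 macro
are hypothetical and are not the lib/conntrack.c data structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 4-tuple; addresses are kept in host byte order. */
struct tuple {
    uint32_t src, dst;
    uint16_t sport, dport;
};

#define IP4(a, b, c, d) \
    (((uint32_t) (a) << 24) | ((uint32_t) (b) << 16) | \
     ((uint32_t) (c) << 8) | (uint32_t) (d))

/* The reply direction of a connection with no NAT applied. */
static struct tuple tuple_reverse(struct tuple t)
{
    struct tuple r = { t.dst, t.src, t.dport, t.sport };
    return r;
}

static bool tuple_equal(struct tuple a, struct tuple b)
{
    return a.src == b.src && a.dst == b.dst
           && a.sport == b.sport && a.dport == b.dport;
}
```

With A = 10.1.1.1 and C = 10.1.1.2, the first (DNATed) connection has the
reply tuple 10.1.1.2:80 -> 10.1.1.1:1. The un-NATed reply of the second
connection is the same tuple, so the entry cannot be created; after SNAT
selects source port 10001, its reply becomes 10.1.1.2:80 -> 10.1.1.1:10001
and no longer collides.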

Re: [ovs-dev] [PATCH ovn] northd: Restore flows that recirculate packets in the router DNAT zone.

2021-04-01 Thread Numan Siddique
On Thu, Apr 1, 2021 at 3:22 PM Lorenzo Bianconi
 wrote:
>
> >
> > Also improve the tests to make sure this doesn't break again in the
> > future.
> >
> > Fixes: 225426081f85 ("northd: introduce build_lrouter_in_dnat_flow routine")
> > CC: Lorenzo Bianconi 
> > Reported-by: Ilya Maximets 
> > Signed-off-by: Dumitru Ceara 
> > ---
> > Note: the regression was only in the ovn-northd C implementation, there
> > are no ovn-northd-ddlog changes required.
> > ---
> >  northd/ovn-northd.c | 26 --
> >  tests/ovn-northd.at | 40 +++-
> >  2 files changed, 39 insertions(+), 27 deletions(-)
>
> Thx for fixing this:
>
> Acked-by: Lorenzo Bianconi 

Thanks Ilya and Dumitru for reporting and fixing this.

I applied this patch to master.

Numan


>
> >
> > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > index 57df62b92..9839b8c4f 100644
> > --- a/northd/ovn-northd.c
> > +++ b/northd/ovn-northd.c
> > @@ -11240,20 +11240,6 @@ build_lrouter_in_dnat_flow(struct hmap *lflows, struct ovn_datapath *od,
> >  &nat->header_);
> >  }
> >  }
> > -
> > -if (!od->l3dgw_port) {
> > -/* For gateway router, re-circulate every packet through
> > -* the DNAT zone.  This helps with the following.
> > -*
> > -* Any packet that needs to be unDNATed in the reverse
> > -* direction gets unDNATed. Ideally this could be done in
> > -* the egress pipeline. But since the gateway router
> > -* does not have any feature that depends on the source
> > -* ip address being external IP address for IP routing,
> > -* we can do it here, saving a future re-circulation. */
> > -ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 50,
> > -  "ip", "flags.loopback = 1; ct_dnat;");
> > -}
> >  }
> >
> >  static void
> > @@ -11716,6 +11702,18 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od,
> >  od->lb_force_snat_addrs.ipv6_addrs[0].addr_s, "lb");
> >  }
> >  }
> > +
> > +/* For gateway router, re-circulate every packet through
> > + * the DNAT zone.  This helps with the following.
> > + *
> > + * Any packet that needs to be unDNATed in the reverse
> > + * direction gets unDNATed. Ideally this could be done in
> > + * the egress pipeline. But since the gateway router
> > + * does not have any feature that depends on the source
> > + * ip address being external IP address for IP routing,
> > + * we can do it here, saving a future re-circulation. */
> > +ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 50,
> > +  "ip", "flags.loopback = 1; ct_dnat;");
> >  }
> >
> >  /* Load balancing and packet defrag are only valid on
> > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> > index 47f2662e4..96476497d 100644
> > --- a/tests/ovn-northd.at
> > +++ b/tests/ovn-northd.at
> > @@ -2700,7 +2700,7 @@ wait_row_count nb:Logical_Switch_Port 1 up=false name=lsp1
> >  AT_CLEANUP
> >
> >  OVN_FOR_EACH_NORTHD([
> > -AT_SETUP([ovn -- lb_force_snat_ip for Gateway Routers])
> > +AT_SETUP([ovn -- Load Balancers and lb_force_snat_ip for Gateway Routers])
> >  ovn_start
> >
> >  check ovn-nbctl ls-add sw0
> > @@ -2740,11 +2740,11 @@ AT_CHECK([grep "lr_in_unsnat" lr0flows | sort], [0], [dnl
> >table=5 (lr_in_unsnat   ), priority=0, match=(1), action=(next;)
> >  ])
> >
> > -AT_CHECK([grep "lr_in_dnat" lr0flows | grep force_snat_for_lb | sort], 
> > [0], [dnl
> > -])
> > -
> > -
> > -AT_CHECK([grep "lr_out_snat" lr0flows | grep force_snat_for_lb | sort], 
> > [0], [dnl
> > +AT_CHECK([grep "lr_in_dnat" lr0flows | sort], [0], [dnl
> > +  table=6 (lr_in_dnat ), priority=0, match=(1), action=(next;)
> > +  table=6 (lr_in_dnat ), priority=120  , match=(ct.est && ip && 
> > ip4.dst == 10.0.0.10 && tcp && tcp.dst == 80), action=(ct_dnat;)
> > +  table=6 (lr_in_dnat ), priority=120  , match=(ct.new && ip && 
> > ip4.dst == 10.0.0.10 && tcp && tcp.dst == 80), 
> > action=(ct_lb(backends=10.0.0.4:8080);)
> > +  table=6 (lr_in_dnat ), priority=50   , match=(ip), 
> > action=(flags.loopback = 1; ct_dnat;)
> >  ])
> >
> >  check ovn-nbctl --wait=sb set logical_router lr0 
> > options:lb_force_snat_ip="20.0.0.4 aef0::4"
> > @@ -2759,14 +2759,18 @@ AT_CHECK([grep "lr_in_unsnat" lr0flows | sort], [0], [dnl
> >table=5 (lr_in_unsnat   ), priority=110  , match=(ip6 && ip6.dst == 
> > aef0::4), action=(ct_snat;)
> >  ])
> >
> > -AT_CHECK([grep "lr_in_dnat" lr0flows | grep force_snat_for_lb | sort], 
> > [0], [dnl
> > +AT_CHECK([grep "lr_in_dnat" lr0flows | sort], [0], [dnl
> > +  table=6 (lr_in_dnat ), priority=0, match=(1), action=(next;)
> >table=6 (lr_in_dnat ), priority=120  , match=(ct.est && ip && 
> > ip4.dst == 10.0.0.10 && t

Re: [ovs-dev] [PATCH ovn] northd: Restore flows that recirculate packets in the router DNAT zone.

2021-04-01 Thread Lorenzo Bianconi
>
> Also improve the tests to make sure this doesn't break again in the
> future.
>
> Fixes: 225426081f85 ("northd: introduce build_lrouter_in_dnat_flow routine")
> CC: Lorenzo Bianconi 
> Reported-by: Ilya Maximets 
> Signed-off-by: Dumitru Ceara 
> ---
> Note: the regression was only in the ovn-northd C implementation, there
> are no ovn-northd-ddlog changes required.
> ---
>  northd/ovn-northd.c | 26 --
>  tests/ovn-northd.at | 40 +++-
>  2 files changed, 39 insertions(+), 27 deletions(-)

Thx for fixing this:

Acked-by: Lorenzo Bianconi 

>
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index 57df62b92..9839b8c4f 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -11240,20 +11240,6 @@ build_lrouter_in_dnat_flow(struct hmap *lflows, struct ovn_datapath *od,
>  &nat->header_);
>  }
>  }
> -
> -if (!od->l3dgw_port) {
> -/* For gateway router, re-circulate every packet through
> -* the DNAT zone.  This helps with the following.
> -*
> -* Any packet that needs to be unDNATed in the reverse
> -* direction gets unDNATed. Ideally this could be done in
> -* the egress pipeline. But since the gateway router
> -* does not have any feature that depends on the source
> -* ip address being external IP address for IP routing,
> -* we can do it here, saving a future re-circulation. */
> -ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 50,
> -  "ip", "flags.loopback = 1; ct_dnat;");
> -}
>  }
>
>  static void
> @@ -11716,6 +11702,18 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od,
>  od->lb_force_snat_addrs.ipv6_addrs[0].addr_s, "lb");
>  }
>  }
> +
> +/* For gateway router, re-circulate every packet through
> + * the DNAT zone.  This helps with the following.
> + *
> + * Any packet that needs to be unDNATed in the reverse
> + * direction gets unDNATed. Ideally this could be done in
> + * the egress pipeline. But since the gateway router
> + * does not have any feature that depends on the source
> + * ip address being external IP address for IP routing,
> + * we can do it here, saving a future re-circulation. */
> +ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 50,
> +  "ip", "flags.loopback = 1; ct_dnat;");
>  }
>
>  /* Load balancing and packet defrag are only valid on
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index 47f2662e4..96476497d 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -2700,7 +2700,7 @@ wait_row_count nb:Logical_Switch_Port 1 up=false name=lsp1
>  AT_CLEANUP
>
>  OVN_FOR_EACH_NORTHD([
> -AT_SETUP([ovn -- lb_force_snat_ip for Gateway Routers])
> +AT_SETUP([ovn -- Load Balancers and lb_force_snat_ip for Gateway Routers])
>  ovn_start
>
>  check ovn-nbctl ls-add sw0
> @@ -2740,11 +2740,11 @@ AT_CHECK([grep "lr_in_unsnat" lr0flows | sort], [0], [dnl
>table=5 (lr_in_unsnat   ), priority=0, match=(1), action=(next;)
>  ])
>
> -AT_CHECK([grep "lr_in_dnat" lr0flows | grep force_snat_for_lb | sort], [0], 
> [dnl
> -])
> -
> -
> -AT_CHECK([grep "lr_out_snat" lr0flows | grep force_snat_for_lb | sort], [0], 
> [dnl
> +AT_CHECK([grep "lr_in_dnat" lr0flows | sort], [0], [dnl
> +  table=6 (lr_in_dnat ), priority=0, match=(1), action=(next;)
> +  table=6 (lr_in_dnat ), priority=120  , match=(ct.est && ip && 
> ip4.dst == 10.0.0.10 && tcp && tcp.dst == 80), action=(ct_dnat;)
> +  table=6 (lr_in_dnat ), priority=120  , match=(ct.new && ip && 
> ip4.dst == 10.0.0.10 && tcp && tcp.dst == 80), 
> action=(ct_lb(backends=10.0.0.4:8080);)
> +  table=6 (lr_in_dnat ), priority=50   , match=(ip), 
> action=(flags.loopback = 1; ct_dnat;)
>  ])
>
>  check ovn-nbctl --wait=sb set logical_router lr0 
> options:lb_force_snat_ip="20.0.0.4 aef0::4"
> @@ -2759,14 +2759,18 @@ AT_CHECK([grep "lr_in_unsnat" lr0flows | sort], [0], [dnl
>table=5 (lr_in_unsnat   ), priority=110  , match=(ip6 && ip6.dst == 
> aef0::4), action=(ct_snat;)
>  ])
>
> -AT_CHECK([grep "lr_in_dnat" lr0flows | grep force_snat_for_lb | sort], [0], 
> [dnl
> +AT_CHECK([grep "lr_in_dnat" lr0flows | sort], [0], [dnl
> +  table=6 (lr_in_dnat ), priority=0, match=(1), action=(next;)
>table=6 (lr_in_dnat ), priority=120  , match=(ct.est && ip && 
> ip4.dst == 10.0.0.10 && tcp && tcp.dst == 80), 
> action=(flags.force_snat_for_lb = 1; ct_dnat;)
>table=6 (lr_in_dnat ), priority=120  , match=(ct.new && ip && 
> ip4.dst == 10.0.0.10 && tcp && tcp.dst == 80), 
> action=(flags.force_snat_for_lb = 1; ct_lb(backends=10.0.0.4:8080);)
> +  table=6 (lr_in_dnat ), priority=50   , match=(ip), 
> action=(flags.loopback = 1; ct_dnat;)
>  ])
>
> -AT_C

Re: [ovs-dev] [PATCH v4 1/2] Encap & Decap actions for MPLS packet type.

2021-04-01 Thread Eelco Chaudron



On 1 Apr 2021, at 11:28, Martin Varghese wrote:


On Thu, Apr 01, 2021 at 11:17:14AM +0200, Eelco Chaudron wrote:



On 1 Apr 2021, at 11:09, Martin Varghese wrote:


On Thu, Apr 01, 2021 at 10:54:42AM +0200, Eelco Chaudron wrote:



On 1 Apr 2021, at 10:35, Martin Varghese wrote:


On Thu, Apr 01, 2021 at 08:59:27AM +0200, Eelco Chaudron wrote:



On 1 Apr 2021, at 6:10, Martin Varghese wrote:


On Wed, Mar 31, 2021 at 03:59:40PM +0200, Eelco Chaudron wrote:



On 26 Mar 2021, at 7:21, Martin Varghese wrote:


From: Martin Varghese 

The encap & decap actions are extended to support MPLS packet type.
Encap & decap actions adds and removes MPLS header at start of the
packet.


Hi Martin,

I’m trying to do some real-life testing, and I’m running into issues. This
might be me setting it up wrongly but just wanting to confirm…


I’m sending an MPLS packet that contains an ARP packet into a
physical port.
This is the packet:

Frame 4: 64 bytes on wire (512 bits), 64 bytes captured (512 bits)
Encapsulation type: Ethernet (1)
[Protocols in frame: eth:ethertype:mpls:data]
Ethernet II, Src: 00:00:00_00:00:01 (00:00:00:00:00:01), Dst: 00:00:00_00:00:02 (00:00:00:00:00:02)
Destination: 00:00:00_00:00:02 (00:00:00:00:00:02)
Address: 00:00:00_00:00:02 (00:00:00:00:00:02)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Source: 00:00:00_00:00:01 (00:00:00:00:00:01)
Address: 00:00:00_00:00:01 (00:00:00:00:00:01)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Type: MPLS label switched packet (0x8847)
MultiProtocol Label Switching Header, Label: 100, Exp: 0, S: 1, TTL: 64
0000 0000 0000 0110 0100 .... .... .... = MPLS Label: 100
.... .... .... .... .... 000. .... .... = MPLS Experimental Bits: 0
.... .... .... .... .... ...1 .... .... = MPLS Bottom Of Label Stack: 1
.... .... .... .... .... .... 0100 0000 = MPLS TTL: 64
Data (46 bytes)

0000  ff ff ff ff ff ff 52 54 00 88 51 38 08 06 00 01  ......RT..Q8....
0010  08 00 06 04 00 01 52 54 00 88 51 38 01 01 01 65  ......RT..Q8...e
0020  00 00 00 00 00 00 01 01 01 64 27 98 a0 47        .........d'..G
Data:
5254008851380806000108000604000152540088513801010165?


I’m trying to use the following rules:

  ovs-ofctl del-flows ovs_pvp_br0
  ovs-ofctl add-flow -O OpenFlow13 ovs_pvp_br0 "priority=100,dl_type=0x8847,mpls_label=100 actions=decap(),decap(packet_type(ns=0,type=0x806)),resubmit(,3)"
  ovs-ofctl add-flow -O OpenFlow13 ovs_pvp_br0 "table=3,priority=10 actions=normal"

With these, I expect the packet to be sent to vnet0, but
it’s not.
Actually,
the datapath rule looks odd, while the userspace rules seem
to match:

  $ ovs-dpctl dump-flows
  
recirc_id(0),in_port(1),eth(),eth_type(0x8847),mpls(label=100/0xf,tc=0/0,ttl=0/0x0,bos=1/1),
packets:13, bytes:1118, used:0.322s,
actions:pop_eth,pop_mpls(eth_type=0x806),recirc(0x19a)
  recirc_id(0x19a),in_port(1),eth_type(0x0806), packets:13,
bytes:884,
used:0.322s, actions:drop

  $ ovs-ofctl dump-flows ovs_pvp_br0 -O OpenFlow13
  cookie=0x0, duration=85.007s, table=0, n_packets=51,
n_bytes=4386,
priority=100,mpls,mpls_label=100
actions=decap(),decap(packet_type(ns=0,type=0x806)),resubmit(,3)
  cookie=0x0, duration=84.990s, table=3, n_packets=51,
n_bytes=3468,
priority=10 actions=NORMAL


The inner packet is ethernet. So the packet type should be
(ns=0,type=0)
?


Forgot to add that I already tried that to start with, based on the example,
but as that did not work I tried 0x806.

PS: I have this as a remark in my review notes, i.e., to explain the ns and
type usage here.


This resulted in packets being counted at the OpenFlow level, but it
results in NO datapath rules. I do get an error though:

2021-04-01T06:53:36.056Z|00141|dpif(handler37)|WARN|system@ovs-system:
failed to put[create] (Invalid argument)
ufid:3d2d6f6d-5a66-4ace-8b09-7cdcfa5efc8e 
recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(1),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:01/00:00:00:00:00:00,dst=00:00:00:00:00:02/00:00:00:00:00:00),eth_type(0x8847),mpls(label=100/0xf,tc=0/0,ttl=64/0x0,bos=1/1),

actions:pop_eth,pop_mpls(eth_type=0x6558),set(eth()),recirc(0x4c)


This set(eth) before the recirc is the problem, I guess. I need
to check.

2021-04-01T06:53:36.056Z|00142|dpif(handler37)|WARN|system@ovs-system:
execute pop_eth,pop_mpls(eth_type=0x6558),set(eth()),recirc(0x4c)
failed
(Invalid argument) on packet 
mpls,vlan_tci=0x,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,mpls_label=100,mpls_tc=0,mpls_ttl=64,mpls_bos=1

 with metadata skb_priority(0),skb_mark(0),in_port(1) mtu 0

Are there missing parts in my kernel that do not get properly
detected by
the feature detection?

$ ovs-appctl dpif/s
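
As a reading aid for the capture quoted above, the 32-bit MPLS label stack
entry packs a 20-bit label, 3 experimental (traffic class) bits, the
bottom-of-stack bit and an 8-bit TTL. The sketch below decodes and encodes
one entry in host byte order; struct mpls_lse, mpls_lse_decode and
mpls_lse_encode are hypothetical names for illustration, not OVS helpers.

```c
#include <stdint.h>

/* Hypothetical decoded form of one MPLS label stack entry. */
struct mpls_lse {
    uint32_t label;  /* 20 bits. */
    uint8_t tc;      /* 3 bits, shown as "Experimental Bits" in the capture. */
    uint8_t bos;     /* 1 bit, bottom of label stack. */
    uint8_t ttl;     /* 8 bits. */
};

/* Decodes a label stack entry given in host byte order. */
static struct mpls_lse mpls_lse_decode(uint32_t lse)
{
    struct mpls_lse f;

    f.label = lse >> 12;
    f.tc = (lse >> 9) & 0x7;
    f.bos = (lse >> 8) & 0x1;
    f.ttl = lse & 0xff;
    return f;
}

/* Builds a label stack entry in host byte order. */
static uint32_t mpls_lse_encode(uint32_t label, uint8_t tc, uint8_t bos,
                                uint8_t ttl)
{
    return ((label & 0xfffff) << 12)
           | ((uint32_t) (tc & 0x7) << 9)
           | ((uint32_t) (bos & 0x1) << 8)
           | ttl;
}
```

For the packet in this thread (label 100, Exp 0, S 1, TTL 64) the encoded
entry is 0x00064140.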

Re: [ovs-dev] [PATCH v4 1/2] Encap & Decap actions for MPLS packet type.

2021-04-01 Thread Martin Varghese
On Thu, Apr 01, 2021 at 11:17:14AM +0200, Eelco Chaudron wrote:
> 
> 
> On 1 Apr 2021, at 11:09, Martin Varghese wrote:
> 
> > On Thu, Apr 01, 2021 at 10:54:42AM +0200, Eelco Chaudron wrote:
> > > 
> > > 
> > > On 1 Apr 2021, at 10:35, Martin Varghese wrote:
> > > 
> > > > On Thu, Apr 01, 2021 at 08:59:27AM +0200, Eelco Chaudron wrote:
> > > > > 
> > > > > 
> > > > > On 1 Apr 2021, at 6:10, Martin Varghese wrote:
> > > > > 
> > > > > > On Wed, Mar 31, 2021 at 03:59:40PM +0200, Eelco Chaudron wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > On 26 Mar 2021, at 7:21, Martin Varghese wrote:
> > > > > > > 
> > > > > > > > From: Martin Varghese 
> > > > > > > > 
> > > > > > > > The encap & decap actions are extended to support MPLS
> > > > > > > > packet type.
> > > > > > > > Encap & decap actions adds and removes MPLS
> > > > > > > > header at start of the
> > > > > > > > packet.
> > > > > > > 
> > > > > > > Hi Martin,
> > > > > > > 
> > > > > > > I’m trying to do some real-life testing, and I’m running into
> > > > > > > issues. This
> > > > > > > might be me setting it up wrongly but just wanting to confirm…
> > > > > > > 
> > > > > > > I’m sending an MPLS packet that contains an ARP packet into a
> > > > > > > physical port.
> > > > > > > This is the packet:
> > > > > > > 
> > > > > > > > Frame 4: 64 bytes on wire (512 bits), 64 bytes captured (512 bits)
> > > > > > > > Encapsulation type: Ethernet (1)
> > > > > > > > [Protocols in frame: eth:ethertype:mpls:data]
> > > > > > > > Ethernet II, Src: 00:00:00_00:00:01 (00:00:00:00:00:01), Dst: 00:00:00_00:00:02 (00:00:00:00:00:02)
> > > > > > > > Destination: 00:00:00_00:00:02 (00:00:00:00:00:02)
> > > > > > > > Address: 00:00:00_00:00:02 (00:00:00:00:00:02)
> > > > > > > > .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
> > > > > > > > .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
> > > > > > > > Source: 00:00:00_00:00:01 (00:00:00:00:00:01)
> > > > > > > > Address: 00:00:00_00:00:01 (00:00:00:00:00:01)
> > > > > > > > .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
> > > > > > > > .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
> > > > > > > > Type: MPLS label switched packet (0x8847)
> > > > > > > > MultiProtocol Label Switching Header, Label: 100, Exp: 0, S: 1, TTL: 64
> > > > > > > > 0000 0000 0000 0110 0100 .... .... .... = MPLS Label: 100
> > > > > > > > .... .... .... .... .... 000. .... .... = MPLS Experimental Bits: 0
> > > > > > > > .... .... .... .... .... ...1 .... .... = MPLS Bottom Of Label Stack: 1
> > > > > > > > .... .... .... .... .... .... 0100 0000 = MPLS TTL: 64
> > > > > > > > Data (46 bytes)
> > > > > > > > 
> > > > > > > > 0000  ff ff ff ff ff ff 52 54 00 88 51 38 08 06 00 01  ......RT..Q8....
> > > > > > > > 0010  08 00 06 04 00 01 52 54 00 88 51 38 01 01 01 65  ......RT..Q8...e
> > > > > > > > 0020  00 00 00 00 00 00 01 01 01 64 27 98 a0 47        .........d'..G
> > > > > > > Data:
> > > > > > > 5254008851380806000108000604000152540088513801010165?
> > > > > > > 
> > > > > > > 
> > > > > > > I’m trying to use the following rules:
> > > > > > > 
> > > > > > >   ovs-ofctl del-flows ovs_pvp_br0
> > > > > > >   ovs-ofctl add-flow -O OpenFlow13 ovs_pvp_br0
> > > > > > > "priority=100,dl_type=0x8847,mpls_label=100
> > > > > > > actions=decap(),decap(packet_type(ns=0,type=0x806)),resubmit(,3)"
> > > > > > >   ovs-ofctl add-flow -O OpenFlow13 ovs_pvp_br0
> > > > > > > "table=3,priority=10
> > > > > > > actions=normal"
> > > > > > > 
> > > > > > > With these, I expect the packet to be sent to vnet0, but
> > > > > > > it’s not.
> > > > > > > Actually,
> > > > > > > the datapath rule looks odd, while the userspace rules seem
> > > > > > > to match:
> > > > > > > 
> > > > > > >   $ ovs-dpctl dump-flows
> > > > > > >   
> > > > > > > recirc_id(0),in_port(1),eth(),eth_type(0x8847),mpls(label=100/0xf,tc=0/0,ttl=0/0x0,bos=1/1),
> > > > > > > packets:13, bytes:1118, used:0.322s,
> > > > > > > actions:pop_eth,pop_mpls(eth_type=0x806),recirc(0x19a)
> > > > > > >   recirc_id(0x19a),in_port(1),eth_type(0x0806), packets:13,
> > > > > > > bytes:884,
> > > > > > > used:0.322s, actions:drop
> > > > > > > 
> > > > > > >   $ ovs-ofctl dump-flows ovs_pvp_br0 -O OpenFlow13
> > > > > > >   cookie=0x0, duration=85.007s, table=0, n_packets=51,
> > > > > > > n_bytes=4386,
> > > > > > > priority=100,mpls,mpls_label=100
> > > > > > > actions=decap(),decap(packet_type(ns=0,type=0x806)),resubmit(,3)
> > > > > > >   cookie=0x0, duration=84.990s, table=3, n_packets=51,
> > > > > > > n_bytes=3468,
> > > > > > > priority=10 actions=NORMAL
> > > > > > > 
> > > >

Re: [ovs-dev] [PATCH v4 1/2] Encap & Decap actions for MPLS packet type.

2021-04-01 Thread Eelco Chaudron



On 1 Apr 2021, at 11:17, Eelco Chaudron wrote:


On 1 Apr 2021, at 11:09, Martin Varghese wrote:


On Thu, Apr 01, 2021 at 10:54:42AM +0200, Eelco Chaudron wrote:



On 1 Apr 2021, at 10:35, Martin Varghese wrote:


On Thu, Apr 01, 2021 at 08:59:27AM +0200, Eelco Chaudron wrote:



On 1 Apr 2021, at 6:10, Martin Varghese wrote:


On Wed, Mar 31, 2021 at 03:59:40PM +0200, Eelco Chaudron wrote:



On 26 Mar 2021, at 7:21, Martin Varghese wrote:


From: Martin Varghese 

The encap & decap actions are extended to support MPLS
packet type.
Encap & decap actions adds and removes MPLS header at start of the
packet.


Hi Martin,

I’m trying to do some real-life testing, and I’m running into issues. This
might be me setting it up wrongly but just wanting to confirm…

I’m sending an MPLS packet that contains an ARP packet into a
physical port.
This is the packet:

Frame 4: 64 bytes on wire (512 bits), 64 bytes captured (512 bits)
Encapsulation type: Ethernet (1)
[Protocols in frame: eth:ethertype:mpls:data]
Ethernet II, Src: 00:00:00_00:00:01 (00:00:00:00:00:01), Dst: 00:00:00_00:00:02 (00:00:00:00:00:02)
Destination: 00:00:00_00:00:02 (00:00:00:00:00:02)
Address: 00:00:00_00:00:02 (00:00:00:00:00:02)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Source: 00:00:00_00:00:01 (00:00:00:00:00:01)
Address: 00:00:00_00:00:01 (00:00:00:00:00:01)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Type: MPLS label switched packet (0x8847)
MultiProtocol Label Switching Header, Label: 100, Exp: 0, S: 1, TTL: 64
0000 0000 0000 0110 0100 .... .... .... = MPLS Label: 100
.... .... .... .... .... 000. .... .... = MPLS Experimental Bits: 0
.... .... .... .... .... ...1 .... .... = MPLS Bottom Of Label Stack: 1
.... .... .... .... .... .... 0100 0000 = MPLS TTL: 64
Data (46 bytes)

0000  ff ff ff ff ff ff 52 54 00 88 51 38 08 06 00 01  ......RT..Q8....
0010  08 00 06 04 00 01 52 54 00 88 51 38 01 01 01 65  ......RT..Q8...e
0020  00 00 00 00 00 00 01 01 01 64 27 98 a0 47        .........d'..G
Data:
5254008851380806000108000604000152540088513801010165?


I’m trying to use the following rules:

  ovs-ofctl del-flows ovs_pvp_br0
  ovs-ofctl add-flow -O OpenFlow13 ovs_pvp_br0 "priority=100,dl_type=0x8847,mpls_label=100 actions=decap(),decap(packet_type(ns=0,type=0x806)),resubmit(,3)"
  ovs-ofctl add-flow -O OpenFlow13 ovs_pvp_br0 "table=3,priority=10 actions=normal"

With these, I expect the packet to be sent to vnet0, but
it’s not.
Actually,
the datapath rule looks odd, while the userspace rules seem
to match:

  $ ovs-dpctl dump-flows
  
recirc_id(0),in_port(1),eth(),eth_type(0x8847),mpls(label=100/0xf,tc=0/0,ttl=0/0x0,bos=1/1),
packets:13, bytes:1118, used:0.322s,
actions:pop_eth,pop_mpls(eth_type=0x806),recirc(0x19a)
  recirc_id(0x19a),in_port(1),eth_type(0x0806), packets:13,
bytes:884,
used:0.322s, actions:drop

  $ ovs-ofctl dump-flows ovs_pvp_br0 -O OpenFlow13
  cookie=0x0, duration=85.007s, table=0, n_packets=51,
n_bytes=4386,
priority=100,mpls,mpls_label=100
actions=decap(),decap(packet_type(ns=0,type=0x806)),resubmit(,3)
  cookie=0x0, duration=84.990s, table=3, n_packets=51,
n_bytes=3468,
priority=10 actions=NORMAL


The inner packet is ethernet. So the packet type should be
(ns=0,type=0)?


Forgot to add that I already tried that to start with, based on the
example, but as that did not work I tried 0x806.

PS: I have this as a remark in my review notes, i.e., to explain the
ns and type usage here.


This resulted in packets being counted at the open flow level, but it
results in NO data path rules. Do get an error though:

2021-04-01T06:53:36.056Z|00141|dpif(handler37)|WARN|system@ovs-system: failed to put[create] (Invalid argument) ufid:3d2d6f6d-5a66-4ace-8b09-7cdcfa5efc8e recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(1),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:01/00:00:00:00:00:00,dst=00:00:00:00:00:02/00:00:00:00:00:00),eth_type(0x8847),mpls(label=100/0xf,tc=0/0,ttl=64/0x0,bos=1/1), actions:pop_eth,pop_mpls(eth_type=0x6558),set(eth()),recirc(0x4c)


This set(eth) before the recirc is the problem, I guess. I need to check.

2021-04-01T06:53:36.056Z|00142|dpif(handler37)|WARN|system@ovs-system: execute pop_eth,pop_mpls(eth_type=0x6558),set(eth()),recirc(0x4c) failed (Invalid argument) on packet mpls,vlan_tci=0x,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,mpls_label=100,mpls_tc=0,mpls_ttl=64,mpls_bos=1 with metadata skb_priority(0),skb_mark(0),in_port(1) mtu 0

Are there missing parts in my kernel that do not get properly detected by
the feature detection?

$ ovs-appctl dpif/show-dp-features ovs_pvp_br0
Masked set action: Yes
Tu

[ovs-dev] [PATCH ovn] northd: Restore flows that recirculate packets in the router DNAT zone.

2021-04-01 Thread Dumitru Ceara
Also improve the tests to make sure this doesn't break again in the
future.

Fixes: 225426081f85 ("northd: introduce build_lrouter_in_dnat_flow routine")
CC: Lorenzo Bianconi 
Reported-by: Ilya Maximets 
Signed-off-by: Dumitru Ceara 
---
Note: the regression was only in the ovn-northd C implementation; no
ovn-northd-ddlog changes are required.
---
 northd/ovn-northd.c | 26 --
 tests/ovn-northd.at | 40 +++-
 2 files changed, 39 insertions(+), 27 deletions(-)

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 57df62b92..9839b8c4f 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -11240,20 +11240,6 @@ build_lrouter_in_dnat_flow(struct hmap *lflows, struct ovn_datapath *od,
 &nat->header_);
 }
 }
-
-if (!od->l3dgw_port) {
-/* For gateway router, re-circulate every packet through
-* the DNAT zone.  This helps with the following.
-*
-* Any packet that needs to be unDNATed in the reverse
-* direction gets unDNATed. Ideally this could be done in
-* the egress pipeline. But since the gateway router
-* does not have any feature that depends on the source
-* ip address being external IP address for IP routing,
-* we can do it here, saving a future re-circulation. */
-ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 50,
-  "ip", "flags.loopback = 1; ct_dnat;");
-}
 }
 
 static void
@@ -11716,6 +11702,18 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od,
 od->lb_force_snat_addrs.ipv6_addrs[0].addr_s, "lb");
 }
 }
+
+/* For gateway router, re-circulate every packet through
+ * the DNAT zone.  This helps with the following.
+ *
+ * Any packet that needs to be unDNATed in the reverse
+ * direction gets unDNATed. Ideally this could be done in
+ * the egress pipeline. But since the gateway router
+ * does not have any feature that depends on the source
+ * ip address being external IP address for IP routing,
+ * we can do it here, saving a future re-circulation. */
+ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 50,
+  "ip", "flags.loopback = 1; ct_dnat;");
 }
 
 /* Load balancing and packet defrag are only valid on
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 47f2662e4..96476497d 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -2700,7 +2700,7 @@ wait_row_count nb:Logical_Switch_Port 1 up=false name=lsp1
 AT_CLEANUP
 
 OVN_FOR_EACH_NORTHD([
-AT_SETUP([ovn -- lb_force_snat_ip for Gateway Routers])
+AT_SETUP([ovn -- Load Balancers and lb_force_snat_ip for Gateway Routers])
 ovn_start
 
 check ovn-nbctl ls-add sw0
@@ -2740,11 +2740,11 @@ AT_CHECK([grep "lr_in_unsnat" lr0flows | sort], [0], [dnl
   table=5 (lr_in_unsnat   ), priority=0, match=(1), action=(next;)
 ])
 
-AT_CHECK([grep "lr_in_dnat" lr0flows | grep force_snat_for_lb | sort], [0], [dnl
-])
-
-
-AT_CHECK([grep "lr_out_snat" lr0flows | grep force_snat_for_lb | sort], [0], [dnl
+AT_CHECK([grep "lr_in_dnat" lr0flows | sort], [0], [dnl
+  table=6 (lr_in_dnat ), priority=0, match=(1), action=(next;)
+  table=6 (lr_in_dnat ), priority=120  , match=(ct.est && ip && ip4.dst == 10.0.0.10 && tcp && tcp.dst == 80), action=(ct_dnat;)
+  table=6 (lr_in_dnat ), priority=120  , match=(ct.new && ip && ip4.dst == 10.0.0.10 && tcp && tcp.dst == 80), action=(ct_lb(backends=10.0.0.4:8080);)
+  table=6 (lr_in_dnat ), priority=50   , match=(ip), action=(flags.loopback = 1; ct_dnat;)
 ])
 
 check ovn-nbctl --wait=sb set logical_router lr0 options:lb_force_snat_ip="20.0.0.4 aef0::4"
@@ -2759,14 +2759,18 @@ AT_CHECK([grep "lr_in_unsnat" lr0flows | sort], [0], [dnl
   table=5 (lr_in_unsnat   ), priority=110  , match=(ip6 && ip6.dst == aef0::4), action=(ct_snat;)
 ])
 
-AT_CHECK([grep "lr_in_dnat" lr0flows | grep force_snat_for_lb | sort], [0], [dnl
+AT_CHECK([grep "lr_in_dnat" lr0flows | sort], [0], [dnl
+  table=6 (lr_in_dnat ), priority=0, match=(1), action=(next;)
   table=6 (lr_in_dnat ), priority=120  , match=(ct.est && ip && ip4.dst == 10.0.0.10 && tcp && tcp.dst == 80), action=(flags.force_snat_for_lb = 1; ct_dnat;)
   table=6 (lr_in_dnat ), priority=120  , match=(ct.new && ip && ip4.dst == 10.0.0.10 && tcp && tcp.dst == 80), action=(flags.force_snat_for_lb = 1; ct_lb(backends=10.0.0.4:8080);)
+  table=6 (lr_in_dnat ), priority=50   , match=(ip), action=(flags.loopback = 1; ct_dnat;)
 ])
 
-AT_CHECK([grep "lr_out_snat" lr0flows | grep force_snat_for_lb | sort], [0], [dnl
+AT_CHECK([grep "lr_out_snat" lr0flows | sort], [0], [dnl
+  table=1 (lr_out_snat), priority=0, match=(1), action=(next;)
   table=1 (lr_out_snat), priority=100  , 
match=(flags.fo

Re: [ovs-dev] [PATCH v4 1/2] Encap & Decap actions for MPLS packet type.

2021-04-01 Thread Eelco Chaudron



On 1 Apr 2021, at 11:09, Martin Varghese wrote:


On Thu, Apr 01, 2021 at 10:54:42AM +0200, Eelco Chaudron wrote:



On 1 Apr 2021, at 10:35, Martin Varghese wrote:


On Thu, Apr 01, 2021 at 08:59:27AM +0200, Eelco Chaudron wrote:



On 1 Apr 2021, at 6:10, Martin Varghese wrote:


On Wed, Mar 31, 2021 at 03:59:40PM +0200, Eelco Chaudron wrote:



On 26 Mar 2021, at 7:21, Martin Varghese wrote:


From: Martin Varghese 

The encap & decap actions are extended to support MPLS
packet type.
Encap & decap actions adds and removes MPLS header at start of the
packet.


Hi Martin,

I’m trying to do some real-life testing, and I’m running into
issues. This
might be me setting it up wrongly but just wanting to confirm…

I’m sending an MPLS packet that contains an ARP packet into a
physical port.
This is the packet:

Frame 4: 64 bytes on wire (512 bits), 64 bytes captured (512 bits)
Encapsulation type: Ethernet (1)
[Protocols in frame: eth:ethertype:mpls:data]
Ethernet II, Src: 00:00:00_00:00:01 (00:00:00:00:00:01), Dst: 00:00:00_00:00:02 (00:00:00:00:00:02)
Destination: 00:00:00_00:00:02 (00:00:00:00:00:02)
Address: 00:00:00_00:00:02 (00:00:00:00:00:02)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Source: 00:00:00_00:00:01 (00:00:00:00:00:01)
Address: 00:00:00_00:00:01 (00:00:00:00:00:01)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Type: MPLS label switched packet (0x8847)
MultiProtocol Label Switching Header, Label: 100, Exp: 0, S: 1, TTL: 64
0000 0000 0000 0110 0100 .... .... .... = MPLS Label: 100
.... .... .... .... .... 000. .... .... = MPLS Experimental Bits: 0
.... .... .... .... .... ...1 .... .... = MPLS Bottom Of Label Stack: 1
.... .... .... .... .... .... 0100 0000 = MPLS TTL: 64
Data (46 bytes)

0000  ff ff ff ff ff ff 52 54 00 88 51 38 08 06 00 01   ......RT..Q8....
0010  08 00 06 04 00 01 52 54 00 88 51 38 01 01 01 65   ......RT..Q8...e
0020  00 00 00 00 00 00 01 01 01 64 27 98 a0 47         .........d'..G
Data: 5254008851380806000108000604000152540088513801010165?


I’m trying to use the following rules:

  ovs-ofctl del-flows ovs_pvp_br0
  ovs-ofctl add-flow -O OpenFlow13 ovs_pvp_br0 "priority=100,dl_type=0x8847,mpls_label=100 actions=decap(),decap(packet_type(ns=0,type=0x806)),resubmit(,3)"
  ovs-ofctl add-flow -O OpenFlow13 ovs_pvp_br0 "table=3,priority=10 actions=normal"

With these, I expect the packet to be sent to vnet0, but it’s not.
Actually, the datapath rule looks odd, while the userspace rules seem
to match:

  $ ovs-dpctl dump-flows
  recirc_id(0),in_port(1),eth(),eth_type(0x8847),mpls(label=100/0xf,tc=0/0,ttl=0/0x0,bos=1/1), packets:13, bytes:1118, used:0.322s, actions:pop_eth,pop_mpls(eth_type=0x806),recirc(0x19a)
  recirc_id(0x19a),in_port(1),eth_type(0x0806), packets:13, bytes:884, used:0.322s, actions:drop

  $ ovs-ofctl dump-flows ovs_pvp_br0 -O OpenFlow13
  cookie=0x0, duration=85.007s, table=0, n_packets=51, n_bytes=4386, priority=100,mpls,mpls_label=100 actions=decap(),decap(packet_type(ns=0,type=0x806)),resubmit(,3)
  cookie=0x0, duration=84.990s, table=3, n_packets=51, n_bytes=3468, priority=10 actions=NORMAL


The inner packet is ethernet. So the packet type should be
(ns=0,type=0)?


Forgot to add that I already tried that to start with, based on the
example, but as that did not work I tried 0x806.

PS: I have this as a remark in my review notes, i.e., to explain the
ns and type usage here.


This resulted in packets being counted at the open flow level, but it
results in NO data path rules. Do get an error though:

2021-04-01T06:53:36.056Z|00141|dpif(handler37)|WARN|system@ovs-system: failed to put[create] (Invalid argument) ufid:3d2d6f6d-5a66-4ace-8b09-7cdcfa5efc8e recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(1),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:01/00:00:00:00:00:00,dst=00:00:00:00:00:02/00:00:00:00:00:00),eth_type(0x8847),mpls(label=100/0xf,tc=0/0,ttl=64/0x0,bos=1/1), actions:pop_eth,pop_mpls(eth_type=0x6558),set(eth()),recirc(0x4c)


This set(eth) before the recirc is the problem, I guess. I need to check.

2021-04-01T06:53:36.056Z|00142|dpif(handler37)|WARN|system@ovs-system: execute pop_eth,pop_mpls(eth_type=0x6558),set(eth()),recirc(0x4c) failed (Invalid argument) on packet mpls,vlan_tci=0x,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,mpls_label=100,mpls_tc=0,mpls_ttl=64,mpls_bos=1 with metadata skb_priority(0),skb_mark(0),in_port(1) mtu 0

Are there missing parts in my kernel that do not get properly detected by
the feature detection?

$ ovs-appctl dpif/show-dp-features ovs_pvp_br0
Masked set action: Yes
Tunnel push pop: No
Ufid: Yes
Truncate action: Yes
Clon

Re: [ovs-dev] [PATCH v4 1/2] Encap & Decap actions for MPLS packet type.

2021-04-01 Thread Martin Varghese
On Thu, Apr 01, 2021 at 10:54:42AM +0200, Eelco Chaudron wrote:
> 
> 
> On 1 Apr 2021, at 10:35, Martin Varghese wrote:
> 
> > On Thu, Apr 01, 2021 at 08:59:27AM +0200, Eelco Chaudron wrote:
> > > 
> > > 
> > > On 1 Apr 2021, at 6:10, Martin Varghese wrote:
> > > 
> > > > On Wed, Mar 31, 2021 at 03:59:40PM +0200, Eelco Chaudron wrote:
> > > > > 
> > > > > 
> > > > > On 26 Mar 2021, at 7:21, Martin Varghese wrote:
> > > > > 
> > > > > > From: Martin Varghese 
> > > > > > 
> > > > > > The encap & decap actions are extended to support MPLS
> > > > > > packet type.
> > > > > > Encap & decap actions adds and removes MPLS header at start of the
> > > > > > packet.
> > > > > 
> > > > > Hi Martin,
> > > > > 
> > > > > I’m trying to do some real-life testing, and I’m running into
> > > > > issues. This
> > > > > might be me setting it up wrongly but just wanting to confirm…
> > > > > 
> > > > > I’m sending an MPLS packet that contains an ARP packet into a
> > > > > physical port.
> > > > > This is the packet:
> > > > > 
> > > > > Frame 4: 64 bytes on wire (512 bits), 64 bytes captured (512 bits)
> > > > > Encapsulation type: Ethernet (1)
> > > > > [Protocols in frame: eth:ethertype:mpls:data]
> > > > > Ethernet II, Src: 00:00:00_00:00:01 (00:00:00:00:00:01), Dst:
> > > > > 00:00:00_00:00:02 (00:00:00:00:00:02)
> > > > > Destination: 00:00:00_00:00:02 (00:00:00:00:00:02)
> > > > > Address: 00:00:00_00:00:02 (00:00:00:00:00:02)
> > > > >  ..0.     = LG bit: Globally unique
> > > > > address
> > > > > (factory default)
> > > > >  ...0     = IG bit: Individual address
> > > > > (unicast)
> > > > > Source: 00:00:00_00:00:01 (00:00:00:00:00:01)
> > > > > Address: 00:00:00_00:00:01 (00:00:00:00:00:01)
> > > > >  ..0.     = LG bit: Globally unique
> > > > > address
> > > > > (factory default)
> > > > >  ...0     = IG bit: Individual address
> > > > > (unicast)
> > > > > Type: MPLS label switched packet (0x8847)
> > > > > MultiProtocol Label Switching Header, Label: 100, Exp: 0, S:
> > > > > 1, TTL:
> > > > > 64
> > > > >    0110 0100    = MPLS Label: 100
> > > > >      000.   = MPLS Experimental
> > > > > Bits: 0
> > > > >      ...1   = MPLS Bottom Of Label
> > > > > Stack: 1
> > > > >       0100  = MPLS TTL: 64
> > > > > Data (46 bytes)
> > > > > 
> > > > >   ff ff ff ff ff ff 52 54 00 88 51 38 08 06 00 01
> > > > > ..RT..Q8
> > > > > 0010  08 00 06 04 00 01 52 54 00 88 51 38 01 01 01 65
> > > > > ..RT..Q8...e
> > > > > 0020  00 00 00 00 00 00 01 01 01 64 27 98 a0 47
> > > > > .d'..G
> > > > > Data:
> > > > > 5254008851380806000108000604000152540088513801010165?
> > > > > 
> > > > > 
> > > > > I’m trying to use the following rules:
> > > > > 
> > > > >   ovs-ofctl del-flows ovs_pvp_br0
> > > > >   ovs-ofctl add-flow -O OpenFlow13 ovs_pvp_br0
> > > > > "priority=100,dl_type=0x8847,mpls_label=100
> > > > > actions=decap(),decap(packet_type(ns=0,type=0x806)),resubmit(,3)"
> > > > >   ovs-ofctl add-flow -O OpenFlow13 ovs_pvp_br0 "table=3,priority=10
> > > > > actions=normal"
> > > > > 
> > > > > With these, I expect the packet to be sent to vnet0, but
> > > > > it’s not.
> > > > > Actually,
> > > > > the datapath rule looks odd, while the userspace rules seem
> > > > > to match:
> > > > > 
> > > > >   $ ovs-dpctl dump-flows
> > > > >   
> > > > > recirc_id(0),in_port(1),eth(),eth_type(0x8847),mpls(label=100/0xf,tc=0/0,ttl=0/0x0,bos=1/1),
> > > > > packets:13, bytes:1118, used:0.322s,
> > > > > actions:pop_eth,pop_mpls(eth_type=0x806),recirc(0x19a)
> > > > >   recirc_id(0x19a),in_port(1),eth_type(0x0806), packets:13,
> > > > > bytes:884,
> > > > > used:0.322s, actions:drop
> > > > > 
> > > > >   $ ovs-ofctl dump-flows ovs_pvp_br0 -O OpenFlow13
> > > > >   cookie=0x0, duration=85.007s, table=0, n_packets=51,
> > > > > n_bytes=4386,
> > > > > priority=100,mpls,mpls_label=100
> > > > > actions=decap(),decap(packet_type(ns=0,type=0x806)),resubmit(,3)
> > > > >   cookie=0x0, duration=84.990s, table=3, n_packets=51,
> > > > > n_bytes=3468,
> > > > > priority=10 actions=NORMAL
> > > > > 
> > > > The inner packet is ethernet. So the packet type should be
> > > > (ns=0,type=0)
> > > > ?
> > > 
> > > Forgot to add that I already tried that to start with, based on the
> > > example,
> > > but as that did not work I tried 0x806.
> > > 
> > > PS: I have this as a remark in my review notes, i.e., to explain the
> > > ns and
> > > type usage here.
> > > 
> > > 
> > > This resulted in packets being counted at the open flow level, but it
> > > results in NO data path rules. Do get an error though:
> > > 
> > > 2021-04-01T06:53:36.056Z|00141|dpif(handler37)|WARN|system@ovs-system:
> > > failed to put[create] (Invalid argument)
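
For readers following the capture quoted in this thread, the 4-byte MPLS label stack entry can be decoded by hand. The sketch below is an illustration only and is not part of the patch under review. It assumes the standard RFC 3032 layout (20-bit label, 3-bit traffic class, 1-bit bottom-of-stack flag, 8-bit TTL), and the helper names are made up for this note. It also reads the first 14 bytes of the quoted payload as an Ethernet header, which is consistent with the remark that the inner packet is Ethernet (here carrying ARP).

```python
import struct


def decode_mpls_lse(data):
    """Decode one 4-byte MPLS label stack entry (RFC 3032 layout)."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "label": word >> 12,       # top 20 bits
        "tc": (word >> 9) & 0x7,   # 3 bits, shown as "Exp" in the capture
        "bos": (word >> 8) & 0x1,  # bottom-of-stack flag
        "ttl": word & 0xFF,        # low 8 bits
    }


def encode_mpls_lse(label, tc, bos, ttl):
    """Build the 4-byte entry from its fields (hypothetical helper)."""
    return struct.pack("!I", (label << 12) | (tc << 9) | (bos << 8) | ttl)


# Values taken from the quoted capture: Label 100, Exp 0, S 1, TTL 64.
lse = encode_mpls_lse(label=100, tc=0, bos=1, ttl=64)
print(lse.hex())              # 00064140
print(decode_mpls_lse(lse))

# First 14 bytes of the payload from the quoted hex dump.
payload = bytes.fromhex("ffffffffffff5254008851380806")
dst, src = payload[:6], payload[6:12]
(ethertype,) = struct.unpack("!H", payload[12:14])
print(dst.hex(":"), src.hex(":"), hex(ethertype))
```

With bos=1 there is no further label, so whatever follows the entry is the inner packet; in this capture it starts with a broadcast destination MAC and Ethertype 0x0806.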

Re: [ovs-dev] [PATCH v4 1/2] Encap & Decap actions for MPLS packet type.

2021-04-01 Thread Eelco Chaudron



On 1 Apr 2021, at 10:35, Martin Varghese wrote:


On Thu, Apr 01, 2021 at 08:59:27AM +0200, Eelco Chaudron wrote:



On 1 Apr 2021, at 6:10, Martin Varghese wrote:


On Wed, Mar 31, 2021 at 03:59:40PM +0200, Eelco Chaudron wrote:



On 26 Mar 2021, at 7:21, Martin Varghese wrote:


From: Martin Varghese 

The encap & decap actions are extended to support MPLS packet type.
Encap & decap actions adds and removes MPLS header at start of the
packet.


Hi Martin,

I’m trying to do some real-life testing, and I’m running into
issues. This
might be me setting it up wrongly but just wanting to confirm…

I’m sending an MPLS packet that contains an ARP packet into a
physical port.
This is the packet:

Frame 4: 64 bytes on wire (512 bits), 64 bytes captured (512 bits)
Encapsulation type: Ethernet (1)
[Protocols in frame: eth:ethertype:mpls:data]
Ethernet II, Src: 00:00:00_00:00:01 (00:00:00:00:00:01), Dst: 00:00:00_00:00:02 (00:00:00:00:00:02)
Destination: 00:00:00_00:00:02 (00:00:00:00:00:02)
Address: 00:00:00_00:00:02 (00:00:00:00:00:02)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Source: 00:00:00_00:00:01 (00:00:00:00:00:01)
Address: 00:00:00_00:00:01 (00:00:00:00:00:01)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Type: MPLS label switched packet (0x8847)
MultiProtocol Label Switching Header, Label: 100, Exp: 0, S: 1, TTL: 64
0000 0000 0000 0110 0100 .... .... .... = MPLS Label: 100
.... .... .... .... .... 000. .... .... = MPLS Experimental Bits: 0
.... .... .... .... .... ...1 .... .... = MPLS Bottom Of Label Stack: 1
.... .... .... .... .... .... 0100 0000 = MPLS TTL: 64
Data (46 bytes)

0000  ff ff ff ff ff ff 52 54 00 88 51 38 08 06 00 01   ......RT..Q8....
0010  08 00 06 04 00 01 52 54 00 88 51 38 01 01 01 65   ......RT..Q8...e
0020  00 00 00 00 00 00 01 01 01 64 27 98 a0 47         .........d'..G
Data: 5254008851380806000108000604000152540088513801010165?


I’m trying to use the following rules:

  ovs-ofctl del-flows ovs_pvp_br0
  ovs-ofctl add-flow -O OpenFlow13 ovs_pvp_br0 "priority=100,dl_type=0x8847,mpls_label=100 actions=decap(),decap(packet_type(ns=0,type=0x806)),resubmit(,3)"
  ovs-ofctl add-flow -O OpenFlow13 ovs_pvp_br0 "table=3,priority=10 actions=normal"

With these, I expect the packet to be sent to vnet0, but it’s not.
Actually, the datapath rule looks odd, while the userspace rules seem
to match:

  $ ovs-dpctl dump-flows
  recirc_id(0),in_port(1),eth(),eth_type(0x8847),mpls(label=100/0xf,tc=0/0,ttl=0/0x0,bos=1/1), packets:13, bytes:1118, used:0.322s, actions:pop_eth,pop_mpls(eth_type=0x806),recirc(0x19a)
  recirc_id(0x19a),in_port(1),eth_type(0x0806), packets:13, bytes:884, used:0.322s, actions:drop

  $ ovs-ofctl dump-flows ovs_pvp_br0 -O OpenFlow13
  cookie=0x0, duration=85.007s, table=0, n_packets=51, n_bytes=4386, priority=100,mpls,mpls_label=100 actions=decap(),decap(packet_type(ns=0,type=0x806)),resubmit(,3)
  cookie=0x0, duration=84.990s, table=3, n_packets=51, n_bytes=3468, priority=10 actions=NORMAL

The inner packet is ethernet. So the packet type should be
(ns=0,type=0)?


Forgot to add that I already tried that to start with, based on the
example, but as that did not work I tried 0x806.

PS: I have this as a remark in my review notes, i.e., to explain the
ns and type usage here.


This resulted in packets being counted at the open flow level, but it
results in NO data path rules. Do get an error though:

2021-04-01T06:53:36.056Z|00141|dpif(handler37)|WARN|system@ovs-system: failed to put[create] (Invalid argument) ufid:3d2d6f6d-5a66-4ace-8b09-7cdcfa5efc8e recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(1),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:01/00:00:00:00:00:00,dst=00:00:00:00:00:02/00:00:00:00:00:00),eth_type(0x8847),mpls(label=100/0xf,tc=0/0,ttl=64/0x0,bos=1/1), actions:pop_eth,pop_mpls(eth_type=0x6558),set(eth()),recirc(0x4c)


This set(eth) before the recirc is the problem, I guess. I need to check.

2021-04-01T06:53:36.056Z|00142|dpif(handler37)|WARN|system@ovs-system: execute pop_eth,pop_mpls(eth_type=0x6558),set(eth()),recirc(0x4c) failed (Invalid argument) on packet mpls,vlan_tci=0x,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,mpls_label=100,mpls_tc=0,mpls_ttl=64,mpls_bos=1 with metadata skb_priority(0),skb_mark(0),in_port(1) mtu 0

Are there missing parts in my kernel that do not get properly detected by
the feature detection?

$ ovs-appctl dpif/show-dp-features ovs_pvp_br0
Masked set action: Yes
Tunnel push pop: No
Ufid: Yes
Truncate action: Yes
Clone action: Yes
Sample nesting: 10
Conntrack eventmask: Yes
Conntrack clear: Yes
Max dp_hash algorithm: 0
Check pkt

Re: [ovs-dev] [PATCH v2 3/5] ipsec: IPv6 default route support for Libreswan

2021-04-01 Thread Ilya Maximets
On 3/31/21 10:05 AM, Mark Gray wrote:
> When configuring IPsec, "ovs-monitor-ipsec" honours
> the 'local_ip' option in the 'Interface' table by configuring
> the 'left' side of the Libreswan connection with 'local_ip'.
> If 'local_ip' is not specified, "ovs-monitor-ipsec" sets
> 'left' to '%defaultroute' which is interpreted as the IP
> address of the default gateway interface.
> 
> However, when 'remote_ip' is an IPv6 address, Libreswan
> still interprets '%defaultroute' as the IPv4 address on the
> default gateway interface (see:
> https://github.com/libreswan/libreswan/issues/416) giving
> an "address family inconsistency" error.
> 
> This patch resolves this issue by specifying the
> connection as IPv6 when the 'remote_ip' is IPv6 and
> 'local_ip' has not been set.
> 
> Signed-off-by: Mark Gray 
> ---
> v2: refactor address family parsing
>  ipsec/ovs-monitor-ipsec.in | 35 +++
>  1 file changed, 35 insertions(+)

Besides the comment I made on the previous version, this patch
looks like a bug fix, unlike the others in the series.  Is there
a reason why it is placed in the middle of the set?  Does it have
any dependency on the previous patches?

Best regards, Ilya Maximets.
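
The selection rule described in the commit message can be sketched as follows. This is a hedged illustration rather than the code under review: `address_family`, `conn_address_family` and `extra_conn_options` are hypothetical helper names, the sketch relies on Python's standard `ipaddress` module, and the two option lines are the ones the earlier revision of this patch adds for IPv6 connections.

```python
import ipaddress


def address_family(address):
    """Return "IPv4", "IPv6", or None if 'address' is unset or invalid."""
    if address is None:
        return None
    try:
        version = ipaddress.ip_address(address).version
    except ValueError:
        return None
    return "IPv6" if version == 6 else "IPv4"


def conn_address_family(remote_ip, local_ip=None):
    """Family for the connection, or None when the two ends disagree."""
    remote = address_family(remote_ip)
    if local_ip is None:
        # 'left' falls back to %defaultroute, so the remote end decides.
        return remote
    local = address_family(local_ip)
    return remote if local == remote else None


# Options named in the earlier revision of the patch (assumed unchanged).
IPV6_CONN = "hostaddrfamily=ipv6\nclientaddrfamily=ipv6\n"


def extra_conn_options(remote_ip, local_ip=None):
    """Extra connection options to emit; empty unless the family is IPv6."""
    if conn_address_family(remote_ip, local_ip) == "IPv6":
        return IPV6_CONN
    return ""


print(repr(extra_conn_options("2001:db8::1")))
print(repr(extra_conn_options("192.0.2.1")))
```

In this sketch a mismatched or unparsable pair yields None, so a caller could treat that as an invalid tunnel configuration instead of guessing a family.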


Re: [ovs-dev] [PATCH v4 1/2] Encap & Decap actions for MPLS packet type.

2021-04-01 Thread Martin Varghese
On Thu, Apr 01, 2021 at 08:59:27AM +0200, Eelco Chaudron wrote:
> 
> 
> On 1 Apr 2021, at 6:10, Martin Varghese wrote:
> 
> > On Wed, Mar 31, 2021 at 03:59:40PM +0200, Eelco Chaudron wrote:
> > > 
> > > 
> > > On 26 Mar 2021, at 7:21, Martin Varghese wrote:
> > > 
> > > > From: Martin Varghese 
> > > > 
> > > > The encap & decap actions are extended to support MPLS packet type.
> > > > Encap & decap actions adds and removes MPLS header at start of the
> > > > packet.
> > > 
> > > Hi Martin,
> > > 
> > > I’m trying to do some real-life testing, and I’m running into
> > > issues. This
> > > might be me setting it up wrongly but just wanting to confirm…
> > > 
> > > I’m sending an MPLS packet that contains an ARP packet into a
> > > physical port.
> > > This is the packet:
> > > 
> > > Frame 4: 64 bytes on wire (512 bits), 64 bytes captured (512 bits)
> > > Encapsulation type: Ethernet (1)
> > > [Protocols in frame: eth:ethertype:mpls:data]
> > > Ethernet II, Src: 00:00:00_00:00:01 (00:00:00:00:00:01), Dst:
> > > 00:00:00_00:00:02 (00:00:00:00:00:02)
> > > Destination: 00:00:00_00:00:02 (00:00:00:00:00:02)
> > > Address: 00:00:00_00:00:02 (00:00:00:00:00:02)
> > >  ..0.     = LG bit: Globally unique
> > > address
> > > (factory default)
> > >  ...0     = IG bit: Individual address
> > > (unicast)
> > > Source: 00:00:00_00:00:01 (00:00:00:00:00:01)
> > > Address: 00:00:00_00:00:01 (00:00:00:00:00:01)
> > >  ..0.     = LG bit: Globally unique
> > > address
> > > (factory default)
> > >  ...0     = IG bit: Individual address
> > > (unicast)
> > > Type: MPLS label switched packet (0x8847)
> > > MultiProtocol Label Switching Header, Label: 100, Exp: 0, S: 1, TTL:
> > > 64
> > >    0110 0100    = MPLS Label: 100
> > >      000.   = MPLS Experimental
> > > Bits: 0
> > >      ...1   = MPLS Bottom Of Label
> > > Stack: 1
> > >       0100  = MPLS TTL: 64
> > > Data (46 bytes)
> > > 
> > >   ff ff ff ff ff ff 52 54 00 88 51 38 08 06 00 01
> > > ..RT..Q8
> > > 0010  08 00 06 04 00 01 52 54 00 88 51 38 01 01 01 65
> > > ..RT..Q8...e
> > > 0020  00 00 00 00 00 00 01 01 01 64 27 98 a0 47
> > > .d'..G
> > > Data:
> > > 5254008851380806000108000604000152540088513801010165?
> > > 
> > > 
> > > I’m trying to use the following rules:
> > > 
> > >   ovs-ofctl del-flows ovs_pvp_br0
> > >   ovs-ofctl add-flow -O OpenFlow13 ovs_pvp_br0
> > > "priority=100,dl_type=0x8847,mpls_label=100
> > > actions=decap(),decap(packet_type(ns=0,type=0x806)),resubmit(,3)"
> > >   ovs-ofctl add-flow -O OpenFlow13 ovs_pvp_br0 "table=3,priority=10
> > > actions=normal"
> > > 
> > > With these, I expect the packet to be sent to vnet0, but it’s not.
> > > Actually,
> > > the datapath rule looks odd, while the userspace rules seem to match:
> > > 
> > >   $ ovs-dpctl dump-flows
> > >   
> > > recirc_id(0),in_port(1),eth(),eth_type(0x8847),mpls(label=100/0xf,tc=0/0,ttl=0/0x0,bos=1/1),
> > > packets:13, bytes:1118, used:0.322s,
> > > actions:pop_eth,pop_mpls(eth_type=0x806),recirc(0x19a)
> > >   recirc_id(0x19a),in_port(1),eth_type(0x0806), packets:13,
> > > bytes:884,
> > > used:0.322s, actions:drop
> > > 
> > >   $ ovs-ofctl dump-flows ovs_pvp_br0 -O OpenFlow13
> > >   cookie=0x0, duration=85.007s, table=0, n_packets=51, n_bytes=4386,
> > > priority=100,mpls,mpls_label=100
> > > actions=decap(),decap(packet_type(ns=0,type=0x806)),resubmit(,3)
> > >   cookie=0x0, duration=84.990s, table=3, n_packets=51, n_bytes=3468,
> > > priority=10 actions=NORMAL
> > > 
> > The inner packet is ethernet. So the packet type should be (ns=0,type=0)
> > ?
> 
> Forgot to add that I already tried that to start with, based on the example,
> but as that did not work I tried 0x806.
> 
> PS: I have this as a remark in my review notes, i.e., to explain the ns and
> type usage here.
> 
> 
> This resulted in packets being counted at the open flow level, but it
> results in NO data path rules. Do get an error though:
> 
> 2021-04-01T06:53:36.056Z|00141|dpif(handler37)|WARN|system@ovs-system:
> failed to put[create] (Invalid argument)
> ufid:3d2d6f6d-5a66-4ace-8b09-7cdcfa5efc8e 
> recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(1),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:01/00:00:00:00:00:00,dst=00:00:00:00:00:02/00:00:00:00:00:00),eth_type(0x8847),mpls(label=100/0xf,tc=0/0,ttl=64/0x0,bos=1/1),
> actions:pop_eth,pop_mpls(eth_type=0x6558),set(eth()),recirc(0x4c)

This set(eth) before the recirc is the problem, I guess. I need to check.
> 2021-04-01T06:53:36.056Z|00142|dpif(handler37)|WARN|system@ovs-system:
> execute pop_eth,pop_mpls(eth_type=0x6558),set(eth()),recirc(0x4c) failed
> (Invalid argument) on packet 
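
On the ns/type question raised in this thread, a small sketch may help. It assumes the namespace numbering documented for the packet_type field in ovs-fields(7), where namespace 0 with type 0 denotes an Ethernet frame and namespace 1 carries an Ethertype. The function names are invented for this note, and nothing here is taken from the patch itself.

```python
NS_ONF = 0        # assumed: ONF namespace; (0, 0) is an Ethernet frame
NS_ETHERTYPE = 1  # assumed: Ethertype namespace; type is an Ethertype


def packet_type(ns, ns_type):
    """Pack namespace and type into one 32-bit value (ns in the high 16 bits)."""
    return (ns << 16) | ns_type


def describe(ns, ns_type):
    """Human-readable reading of a (ns, type) pair under the assumptions above."""
    if (ns, ns_type) == (NS_ONF, 0):
        return "Ethernet frame"
    if ns == NS_ETHERTYPE:
        return "payload identified by Ethertype 0x%04x" % ns_type
    return "no well-known meaning in this sketch"


for ns, ns_type in [(0, 0), (1, 0x0806), (0, 0x0806)]:
    print("(ns=%d,type=0x%x) -> 0x%08x: %s"
          % (ns, ns_type, packet_type(ns, ns_type), describe(ns, ns_type)))
```

Under that reading, (ns=0,type=0x806) names neither an Ethernet frame nor the ARP Ethertype, which may be why (ns=0,type=0) was suggested for an inner Ethernet frame.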

Re: [ovs-dev] [PATCH 3/4] ipsec: IPv6 default route support for Libreswan

2021-04-01 Thread Ilya Maximets
On 3/30/21 6:15 PM, Mark Gray wrote:
> On 30/03/2021 15:28, Aaron Conole wrote:
>> Mark Gray  writes:
>>
>>> When configuring IPsec, "ovs-monitor-ipsec" honours
>>> the 'local_ip' option in the 'Interface' table by configuring
>>> the 'left' side of the Libreswan connection with 'local_ip'.
>>> If 'local_ip' is not specified, "ovs-monitor-ipsec" sets
>>> 'left' to '%defaultroute' which is interpreted as the IP
>>> address of the default gateway interface.
>>>
>>> However, when 'remote_ip' is an IPv6 address, Libreswan
>>> still interprets '%defaultroute' as the IPv4 address on the
>>> default gateway interface (see:
>>> https://github.com/libreswan/libreswan/issues/416) giving
>>> an "address family inconsistency" error.
>>>
>>> This patch resolves this issue by specifying the
>>> connection as IPv6 when the 'remote_ip' is IPv6 and
>>> 'local_ip' has not been set.
>>>
>>> Signed-off-by: Mark Gray 
>>> ---
>>>  ipsec/ovs-monitor-ipsec.in | 54 +-
>>>  1 file changed, 53 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/ipsec/ovs-monitor-ipsec.in b/ipsec/ovs-monitor-ipsec.in
>>> index 9f412aaaf25a..b8cfb0a8ae79 100755
>>> --- a/ipsec/ovs-monitor-ipsec.in
>>> +++ b/ipsec/ovs-monitor-ipsec.in
>>> @@ -14,10 +14,11 @@
>>>  # limitations under the License.
>>>  
>>>  import argparse
>>> +import copy
>>
>> I think it's okay to get things in alphabetical order, but it's not
>> related.
> 
> I'll change back
> 
>>
>>> +import ipaddress
>>>  import re
>>>  import subprocess
>>>  import sys
>>> -import copy
>>>  import os
>>>  from string import Template
>>>  
>>> @@ -413,6 +414,11 @@ conn prevent_unencrypted_vxlan
>>>  leftprotoport=udp/4789
>>>  mark={0}
>>>  
>>> +"""
>>> +
>>> +IPV6_CONN = """\
>>> +hostaddrfamily=ipv6
>>> +clientaddrfamily=ipv6
>>>  """
>>>  
>>>  auth_tmpl = {"psk": Template("""\
>>> @@ -528,6 +534,9 @@ conn prevent_unencrypted_vxlan
>>>  else:
>>>  auth_section = self.auth_tmpl["pki_ca"].substitute(tunnel.conf)
>>>  
>>> +if tunnel.conf["address_family"] == "IPv6":
>>> +auth_section = self.IPV6_CONN + auth_section
>>> +
>>>  vals = tunnel.conf.copy()
>>>  vals["auth_section"] = auth_section
>>>  vals["version"] = tunnel.version
>>> @@ -795,6 +804,7 @@ class IPsecTunnel(object):
>>>Tunnel Type:$tunnel_type
>>>Local IP:   $local_ip
>>>Remote IP:  $remote_ip
>>> +  Address Family: $address_family
>>>SKB mark:   $skb_mark
>>>Local cert: $certificate
>>>Local name: $local_name
>>> @@ -836,6 +846,9 @@ class IPsecTunnel(object):
>>>  "tunnel_type": row.type,
>>>  "local_ip": options.get("local_ip", "%defaultroute"),
>>>  "remote_ip": options.get("remote_ip"),
>>> +"address_family": self._get_conn_address_family(
>>> +   options.get("remote_ip"),
>>> +   options.get("local_ip")),
>>>  "skb_mark": monitor.conf["skb_mark"],
>>>  "certificate": monitor.conf["pki"]["certificate"],
>>>  "private_key": monitor.conf["pki"]["private_key"],
>>> @@ -904,6 +917,24 @@ class IPsecTunnel(object):
>>>  
>>>  return header + conf + status + spds + sas + cons + "\n"
>>>  
>>> +def _get_conn_address_family(self, remote_ip, local_ip):
>>> +remote = address_family(remote_ip)
>>> +local = address_family(local_ip)
>>> +
>>> +if local == "IPv4" and remote == "IPv4":
>>> +return "IPv4"
>>> +elif local == "IPv6" and remote == "IPv6":
>>> +return "IPv6"
>>> +elif remote == "IPv4" and local_ip is None:
>>> +return "IPv4"
>>> +elif remote == "IPv6" and local_ip is None:
>>> +return "IPv6"
>>> +elif remote != local:
>>> +# remote family and local family are mismatched
>>> +return None
>>> +else:
>>> +return None
>>> +
>>
>> I think we can shrink this whole section to:
>>
>>
>> def _get_conn_address_family(self, remote_ip, local_ip):
>>     remote = address_family(remote_ip)
>>     local = address_family(local_ip)
>>
>>     if local is None:
>>         return remote
>>     elif local != remote:
>>         return None
>>
>>     return remote
> 
> Yes, you are right. Simpler.
>>
>>
>>>  def _is_valid_tunnel_conf(self):
>>>  """This function verifies if IPsec tunnel has valid configuration
>>>  set in 'conf'.  If it is valid, then it returns True.  Otherwise,
>>> @@ -1160,6 +1191,27 @@ class IPsecMonitor(object):
>>>  
>>>  return m.group(1)
>>>  
>>> +def is_ipv4(address):
>>> +    try:
>>> +        ipaddress.IPv4Address(address)
>>> +    except ipaddress.AddressValueError:
>>> +        return False
>>> +    return True
>>> +
>>> +def is_ipv6(address):
>>> +    try:

Re: [ovs-dev] [PATCH v4 1/2] Encap & Decap actions for MPLS packet type.

2021-04-01 Thread Eelco Chaudron



On 1 Apr 2021, at 8:59, Eelco Chaudron wrote:


On 1 Apr 2021, at 6:10, Martin Varghese wrote:


On Wed, Mar 31, 2021 at 03:59:40PM +0200, Eelco Chaudron wrote:



On 26 Mar 2021, at 7:21, Martin Varghese wrote:


From: Martin Varghese 

The encap & decap actions are extended to support MPLS packet type.
Encap & decap actions adds and removes MPLS header at start of the
packet.


Hi Martin,

I’m trying to do some real-life testing, and I’m running into issues.
This might be me setting it up wrongly, but I just want to confirm…

I’m sending an MPLS packet that contains an ARP packet into a physical
port. This is the packet:

Frame 4: 64 bytes on wire (512 bits), 64 bytes captured (512 bits)
    Encapsulation type: Ethernet (1)
    [Protocols in frame: eth:ethertype:mpls:data]
Ethernet II, Src: 00:00:00_00:00:01 (00:00:00:00:00:01), Dst: 00:00:00_00:00:02 (00:00:00:00:00:02)
    Destination: 00:00:00_00:00:02 (00:00:00:00:00:02)
        Address: 00:00:00_00:00:02 (00:00:00:00:00:02)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Source: 00:00:00_00:00:01 (00:00:00:00:00:01)
        Address: 00:00:00_00:00:01 (00:00:00:00:00:01)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Type: MPLS label switched packet (0x8847)
MultiProtocol Label Switching Header, Label: 100, Exp: 0, S: 1, TTL: 64
    0000 0000 0000 0110 0100 .... .... .... = MPLS Label: 100
    .... .... .... .... .... 000. .... .... = MPLS Experimental Bits: 0
    .... .... .... .... .... ...1 .... .... = MPLS Bottom Of Label Stack: 1
    .... .... .... .... .... .... 0100 0000 = MPLS TTL: 64
Data (46 bytes)

0000  ff ff ff ff ff ff 52 54 00 88 51 38 08 06 00 01   ......RT..Q8....
0010  08 00 06 04 00 01 52 54 00 88 51 38 01 01 01 65   ......RT..Q8...e
0020  00 00 00 00 00 00 01 01 01 64 27 98 a0 47         .........d'..G

    Data: 5254008851380806000108000604000152540088513801010165?
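
As a cross-check of the dissection above, here is a minimal Python sketch that builds and decodes a 4-byte MPLS label stack entry with the same field values (label 100, Exp 0, S 1, TTL 64) and reads the EtherType of the inner frame from the first bytes of the payload in the hex dump. The function name `parse_mpls_lse()` is made up for this illustration; only the standard `struct` module is used.

```python
import struct


def parse_mpls_lse(data):
    """Decode one MPLS label stack entry (20-bit label, 3-bit TC/Exp,
    1-bit bottom-of-stack flag, 8-bit TTL) from the first 4 bytes."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "label": word >> 12,
        "tc": (word >> 9) & 0x7,
        "bos": (word >> 8) & 0x1,
        "ttl": word & 0xFF,
    }


# Label stack entry with the values shown in the dissection.
lse = struct.pack("!I", (100 << 12) | (0 << 9) | (1 << 8) | 64)
fields = parse_mpls_lse(lse)
print(lse.hex(), fields)

# First 14 payload bytes from the hex dump: an inner Ethernet header.
inner = bytes.fromhex("ffffffffffff5254008851380806")
(inner_ethertype,) = struct.unpack("!H", inner[12:14])
print(hex(inner_ethertype))
```

The payload starts with a full Ethernet header (broadcast destination, then source, then EtherType 0x0806), so ARP is the EtherType of the inner frame and the bytes directly after the MPLS header are Ethernet.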


I’m trying to use the following rules:

  ovs-ofctl del-flows ovs_pvp_br0
  ovs-ofctl add-flow -O OpenFlow13 ovs_pvp_br0 "priority=100,dl_type=0x8847,mpls_label=100 actions=decap(),decap(packet_type(ns=0,type=0x806)),resubmit(,3)"
  ovs-ofctl add-flow -O OpenFlow13 ovs_pvp_br0 "table=3,priority=10 actions=normal"

With these, I expect the packet to be sent to vnet0, but it’s not.
Actually, the datapath rule looks odd, while the userspace rules seem
to match:

  $ ovs-dpctl dump-flows
  recirc_id(0),in_port(1),eth(),eth_type(0x8847),mpls(label=100/0xf,tc=0/0,ttl=0/0x0,bos=1/1), packets:13, bytes:1118, used:0.322s, actions:pop_eth,pop_mpls(eth_type=0x806),recirc(0x19a)
  recirc_id(0x19a),in_port(1),eth_type(0x0806), packets:13, bytes:884, used:0.322s, actions:drop

  $ ovs-ofctl dump-flows ovs_pvp_br0 -O OpenFlow13
  cookie=0x0, duration=85.007s, table=0, n_packets=51, n_bytes=4386, priority=100,mpls,mpls_label=100 actions=decap(),decap(packet_type(ns=0,type=0x806)),resubmit(,3)
  cookie=0x0, duration=84.990s, table=3, n_packets=51, n_bytes=3468, priority=10 actions=NORMAL

The inner packet is Ethernet, so the packet type should be
(ns=0,type=0)?


Forgot to add that I already tried that to start with, based on the
example, but as that did not work I tried 0x806.


PS: I have this as a remark in my review notes, i.e., to explain the
ns and type usage here.



This resulted in packets being counted at the OpenFlow level, but it
results in NO datapath rules. I do get an error though:


2021-04-01T06:53:36.056Z|00141|dpif(handler37)|WARN|system@ovs-system: failed to put[create] (Invalid argument) ufid:3d2d6f6d-5a66-4ace-8b09-7cdcfa5efc8e recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(1),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=00:00:00:00:00:01/00:00:00:00:00:00,dst=00:00:00:00:00:02/00:00:00:00:00:00),eth_type(0x8847),mpls(label=100/0xf,tc=0/0,ttl=64/0x0,bos=1/1), actions:pop_eth,pop_mpls(eth_type=0x6558),set(eth()),recirc(0x4c)
2021-04-01T06:53:36.056Z|00142|dpif(handler37)|WARN|system@ovs-system: execute pop_eth,pop_mpls(eth_type=0x6558),set(eth()),recirc(0x4c) failed (Invalid argument) on packet mpls,vlan_tci=0x,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,mpls_label=100,mpls_tc=0,mpls_ttl=64,mpls_bos=1 with metadata skb_priority(0),skb_mark(0),in_port(1) mtu 0

Are there missing parts in my kernel that do not get properly detected
by the feature detection?


Just to be sure, I built the “latest” net kernel, e43accba9b07,
5.12.0-rc2+, and I see the same problem.



$ ovs-appctl dpif/show-dp-features ovs_pvp_br0
Masked set action: Yes
Tunnel push pop: No
Ufid: Yes
Truncate action: Yes
Clone action: Yes
Sample nesting: 10
Conntrack eventmask: Yes
Conntrack clear: Yes
Max dp_hash algorithm: 0
Check pkt lengt

Re: [ovs-dev] [PATCH v2 2/2] testsuite: add test cases for ingress_policing parameters

2021-04-01 Thread Simon Horman
On Thu, Mar 25, 2021 at 04:21:34PM -0300, Marcelo Ricardo Leitner wrote:
> On Fri, Mar 12, 2021 at 12:59:17PM +0100, Simon Horman wrote:
> > --- a/tests/system-offloads-traffic.at
> > +++ b/tests/system-offloads-traffic.at
> > @@ -70,3 +70,50 @@ AT_CHECK([ovs-appctl upcall/show | grep -E "offloaded flows : [[1-9]]"], [0], [i
> >  
> >  OVS_TRAFFIC_VSWITCHD_STOP
> >  AT_CLEANUP
> > +
> > +AT_SETUP([offloads - set ingress_policing_rate and ingress_policing_burst - offloads disabled])
> > +AT_KEYWORDS([ingress_policing])
> > +OVS_TRAFFIC_VSWITCHD_START()
> > +AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:hw-offload=false])
> > +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
> > +ADD_NAMESPACES(at_ns0)
> > +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24")
> > +AT_CHECK([ovs-vsctl set interface ovs-p0 ingress_policing_rate=100])
> > +AT_CHECK([ovs-vsctl set interface ovs-p0 ingress_policing_burst=10])
> > +AT_CHECK([ovs-vsctl list open | grep other_config > other_config.txt])
> > +AT_CHECK([cat other_config.txt],[0],
> > +[other_config: {hw-offload="false"}
> > +])
> > +AT_CHECK([tc -s -d filter show dev ovs-p0 ingress |grep rate| awk '{a=index($0,"rate");b=index($0,"mtu");print substr($0,a,b-a-1)}' > tc_ovs-p0.txt ],[0],[])
> 
> Uhh.. I wanted to say "just use tc -json ... | jq ..." to avoid all this
> parsing, but the action police doesn't support it yet. :-(
> 
> > +AT_CHECK([cat tc_ovs-p0.txt],[0],
> > +[rate 100Kbit burst 1280b
> 
> This ties back to my previous email. Is 1280b really what is expected
> here? (for both flavors)
> If 100K was kept as 100K, seems 'K' is what is wanted.
> Then, the burst would be 1250b.
> 
> While reading the code on this, now noticed that in
> netdev_dpdk_policer_construct() it uses 1000 for both.
> 
> Anyhow, good that this is now getting asserted.
> 
> Other than this and comments already made by Tonghao Zhang, I have no
> further comments.

Thanks. I did not notice Tonghao Zhang's comments until after I sent my
"ping, please review" email to you. But we do see them. And we are working
on resolving that problem - the test should be skipped if not supported
by the environment, AFAIC.
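
To make the 1280b versus 1250b question above concrete, here is a small Python sketch of the arithmetic. It assumes the burst is configured as 10 (as in the test) and that the only difference is whether "K" means 1024 or 1000 before dividing by 8 to get bytes. The function name `burst_bytes()` is made up for this illustration.

```python
def burst_bytes(kbits_burst, kilo):
    """Convert a burst given in kilobits to bytes for a chosen 'kilo'."""
    return kbits_burst * kilo // 8


# ingress_policing_burst=10, as set in the test case
print(burst_bytes(10, 1024))  # binary kilo
print(burst_bytes(10, 1000))  # decimal kilo
```

With the binary multiplier the result is the 1280 bytes the test asserts; with the decimal multiplier it would be 1250 bytes.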


Re: [ovs-dev] [oss-drivers] Re: [PATCH v2 1/2] netdev-linux: correct unit of burst parameter

2021-04-01 Thread Simon Horman
On Thu, Mar 25, 2021 at 03:51:11PM -0300, Marcelo Ricardo Leitner wrote:
> On Fri, Mar 12, 2021 at 12:59:16PM +0100, Simon Horman wrote:
> > From: "Yong.Xu" 
> > 
> > Correct calculation of burst parameter used when configuring TC policer
> > action for ingress port-based policing in the case where TC offload is in
> > use. This now matches the value calculated for the case where TC offload is
> > not in use.
> > 
> > The division by 8 is to convert from bits to bytes.
> > It's unclear why 64 was previously used.
> 
> Yeah.. I have the feeling that it might be related to kernel's:
>   /* Avoid doing 64 bit divide */
>   #define PSCHED_SHIFT                6
>   #define PSCHED_TICKS2NS(x)  ((s64)(x) << PSCHED_SHIFT)
>   #define PSCHED_NS2TICKS(x)  ((x) >> PSCHED_SHIFT)
> but I can't confirm it.

That is a pretty good point, and I suspect you are correct.
But I honestly don't know either.
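
For readers following the guess above, this Python sketch mirrors the two quoted kernel macros to show where a factor of 64 appears: a shift by 6 is a multiplication or division by 2**6. It only illustrates the arithmetic and does not confirm that the old OVS divisor came from here; the function names are lower-case renderings of the macros.

```python
PSCHED_SHIFT = 6


def psched_ticks2ns(ticks):
    """Mirror of PSCHED_TICKS2NS: one tick is 2**PSCHED_SHIFT ns."""
    return ticks << PSCHED_SHIFT


def psched_ns2ticks(ns):
    """Mirror of PSCHED_NS2TICKS: shift instead of a 64-bit divide."""
    return ns >> PSCHED_SHIFT


print(psched_ticks2ns(1))
print(psched_ns2ticks(6400))
```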

> > Fixes: e7f6ba220 ("lib/tc: add ingress ratelimiting support for tc-offload")
> > Signed-off-by: Yong.Xu 
> > [simon: reworked changelog]
> > Signed-off-by: Simon Horman 
> > Signed-off-by: Louis Peens 
> > ---
> >  lib/netdev-linux.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
> > index 15b25084b..f87a20075 100644
> > --- a/lib/netdev-linux.c
> > +++ b/lib/netdev-linux.c
> > @@ -2572,7 +2572,7 @@ exit:
> >  static struct tc_police
> >  tc_matchall_fill_police(uint32_t kbits_rate, uint32_t kbits_burst)
> >  {
> > -unsigned int bsize = MIN(UINT32_MAX / 1024, kbits_burst) * 1024 / 64;
> > +unsigned int bsize = MIN(UINT32_MAX / 1024, kbits_burst) * 1024 / 8;
> >  unsigned int bps = ((uint64_t) kbits_rate * 1000) / 8;
> 
> I know that the patch is not changing this but, while at this, why for
> bsize the 'k' is 1024, while for bps it's 1000?
> 
> AFAICT it backtracks to
>   netdev_set_policing(iface->netdev,
>                       MIN(UINT32_MAX, iface->cfg->ingress_policing_rate),
>                       MIN(UINT32_MAX, iface->cfg->ingress_policing_burst));
> in iface_configure_qos() and I don't see a reason for them being
> different.
> 
> qos.rst states:
> ``ingress_policing_rate``
>   the maximum rate (in Kbps) that this VM should be allowed to send
> 
> ``ingress_policing_burst``
>   a parameter to the policing algorithm to indicate the maximum amount of data
>   (in Kb) that this interface can send beyond the policing rate.
> 
> Both with capital K. So if the 64 was bothering, the 1024 is likely
> doing so as well.

Thanks, I seem to recall looking into this once. I will dig once again and
try to answer the question. Possibly we will then decide to change things.
Let's see.
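
As a side note, the effect of the one-line change can be checked with a Python rendering of the two expressions in tc_matchall_fill_police(). This is a sketch of the arithmetic only, with 32-bit truncation approximated by masking; `fill_police_sizes()` and its `divisor` parameter are made up for this illustration and are not part of the patch.

```python
UINT32_MAX = 0xFFFFFFFF


def fill_police_sizes(kbits_rate, kbits_burst, divisor=8):
    """Return (bsize, bps) as computed by the patched C expressions.

    divisor=8 matches the patch; divisor=64 reproduces the old code.
    """
    # The clamp keeps the multiplication by 1024 within 32 bits.
    bsize = min(UINT32_MAX // 1024, kbits_burst) * 1024 // divisor
    # The C code computes this in 64 bits, then stores it in an unsigned int.
    bps = (kbits_rate * 1000 // 8) & UINT32_MAX
    return bsize, bps


print(fill_police_sizes(100, 10))              # patched: divide by 8
print(fill_police_sizes(100, 10, divisor=64))  # previous: divide by 64
```

With rate 100 and burst 10, the patched expression yields a burst size eight times larger than the previous one, while the rate in bytes per second is unchanged.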

> Nevertheless, as the patch is fixing the /64, patch LGTM.
> 
> >  struct tc_police police;
> >  struct tc_ratespec rate;
> > -- 
> > 2.20.1
> > 
> 