OVN routers are configured to drop any traffic whose destination is
one of the Reserved Multicast Addresses (RFC 4291).

This is done by matching on all the bits of ip6.dst, except for bits
112-115 that select one of the 16 reserved addresses.  Once installed
into OVS, this turns into the following match:

  ipv6_dst=ff00::/fff0:ffff:ffff:ffff:ffff:ffff:ffff:ffff
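
As an illustration only, the mask above can be rebuilt from the two bit
ranges of the old predicate, ip6.dst[116..127] and ip6.dst[0..111],
assuming bit 0 is the least significant bit of the address.  The helper
names below are hypothetical and are not OVN or OVS functions; this is
a minimal sketch, not the code either project uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration; not OVN or OVS APIs.
 * A 128-bit mask is stored in network byte order: mask[0] holds
 * bits 120..127 and mask[15] holds bits 0..7. */

/* Sets bits lo..hi (inclusive) in 'mask'. */
static void
mask_set_bits(uint8_t mask[16], int lo, int hi)
{
    assert(lo >= 0 && lo <= hi && hi <= 127);
    for (int bit = lo; bit <= hi; bit++) {
        mask[15 - bit / 8] |= (uint8_t) (1u << (bit % 8));
    }
}

/* Returns the number of bits set in 'mask'. */
static int
mask_count_bits(const uint8_t mask[16])
{
    int n = 0;

    for (int bit = 0; bit < 128; bit++) {
        n += (mask[15 - bit / 8] >> (bit % 8)) & 1;
    }
    return n;
}

/* Returns true if 'mask' is a plain prefix: all ones, then all zeros. */
static bool
mask_is_contiguous(const uint8_t mask[16])
{
    bool seen_zero = false;

    for (int bit = 127; bit >= 0; bit--) {
        bool set = (mask[15 - bit / 8] >> (bit % 8)) & 1;

        if (!set) {
            seen_zero = true;
        } else if (seen_zero) {
            return false;
        }
    }
    return true;
}

/* Builds the mask implied by the old ip6.mcast_rsvd predicate. */
static void
old_mcast_rsvd_mask(uint8_t mask[16])
{
    memset(mask, 0, 16);
    mask_set_bits(mask, 116, 127);
    mask_set_bits(mask, 0, 111);
}
```

The resulting mask is fff0:ffff:ffff:ffff:ffff:ffff:ffff:ffff: 124 bits
are set and the hole at bits 112..115 makes it non-contiguous.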

We fixed a large chunk of IPv6 datapath flow explosion issues by
turning on prefix tracking in the flow classifier in OVS in commit
89e43f7528b0 ("controller: Fix IPv6 dp flow explosion by setting flow
table prefixes.").  However, prefix tracking doesn't work for masks
that are not contiguous.  That means that if a packet reaches a
classifier subtable with non-contiguous mask, all the bits of that
mask will be un-wildcarded.  It's not a huge problem in the general case,
because most non-contiguous masks would typically match on just a few
bits.  But ip6.mcast_rsvd is matching on 124 bits, un-wildcarding them
for most of the IPv6 traffic traversing a router and causing creation
of a separate exact-match datapath flow per destination IP.

For setups that handle a large amount of traffic from many different
external addresses, this issue makes IPv6 handling significantly harder
than IPv4, causing much higher load on the datapath with potential
overflow of datapath flow tables and a subsequent upcall storm.
Even without the overflow, OVS spends a lot of time revalidating all
these datapath flows, burning CPU cycles.

In general, since the number of external IP addresses is virtually
unlimited, there should be no configuration where OVN exact-matches
on them; otherwise, it becomes a significant datapath scaling issue.

Fix that by replacing the non-contiguous bit-match with a match on an
address set where all the reserved multicast addresses are just listed
directly.  There are only 16 of them, so having 15 extra OpenFlow rules
per router should not be a huge problem, but it will allow OVS to use
prefix tracking for these flows and avoid creating a separate datapath
flow per destination IP.
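
A rough sketch of why the two forms select the same destinations,
written as standalone C with hypothetical names (OLD_MASK, OLD_VALUE,
old_match, new_match and rsvd_addr are not OVN or OVS identifiers).
It compares the old masked match against an explicit list of the 16
addresses ff00:: through ff0f::.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration; not OVN or OVS code.  Addresses are 16
 * bytes in network byte order. */

/* Old match: ff00::/fff0:ffff:ffff:ffff:ffff:ffff:ffff:ffff. */
static const uint8_t OLD_MASK[16] = {
    0xff, 0xf0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
    0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
};
static const uint8_t OLD_VALUE[16] = { 0xff };

/* Returns true if 'addr' matches the old non-contiguous mask. */
static bool
old_match(const uint8_t addr[16])
{
    for (int i = 0; i < 16; i++) {
        if ((addr[i] & OLD_MASK[i]) != OLD_VALUE[i]) {
            return false;
        }
    }
    return true;
}

/* Fills 'addr' with the i-th reserved multicast address, ff0<i>::. */
static void
rsvd_addr(int i, uint8_t addr[16])
{
    assert(i >= 0 && i < 16);
    memset(addr, 0, 16);
    addr[0] = 0xff;
    addr[1] = (uint8_t) i;
}

/* Returns true if 'addr' equals one of the 16 listed addresses.  Each
 * comparison is an exact match, i.e. a contiguous /128 mask. */
static bool
new_match(const uint8_t addr[16])
{
    uint8_t rsvd[16];

    for (int i = 0; i < 16; i++) {
        rsvd_addr(i, rsvd);
        if (!memcmp(addr, rsvd, 16)) {
            return true;
        }
    }
    return false;
}
```

Both functions accept exactly ff00:: through ff0f:: and reject, for
example, ff02::1 and ff10::.  The difference is only in the shape of
the masks that end up in the classifier.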

Also add a simple lsp-to-external routing test case to make sure
we don't have exact matches in this simple common use case.

The OVS classifier can likely be improved to handle non-contiguous
masks better, but it's not how the prefix tracking is designed, so
it's not a simple task.

Fixes: 677a3ba4d66b ("ovn: Add MLD support.")
Reported-at: https://issues.redhat.com/browse/FDP-1557
Signed-off-by: Ilya Maximets <i.maxim...@ovn.org>
---
 lib/logical-fields.c |   8 ++-
 tests/ovn.at         | 151 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 157 insertions(+), 2 deletions(-)

diff --git a/lib/logical-fields.c b/lib/logical-fields.c
index e479a78c1..f19eb579b 100644
--- a/lib/logical-fields.c
+++ b/lib/logical-fields.c
@@ -266,8 +266,12 @@ ovn_init_symtab(struct shash *symtab)
 
     /* Predefined IPv6 multicast groups (RFC 4291, 2.7.1). */
     expr_symtab_add_predicate(symtab, "ip6.mcast_rsvd",
-                              "ip6.dst[116..127] == 0xff0 && "
-                              "ip6.dst[0..111] == 0x0");
+                              "ip6.dst == { "
+                                  "ff00::0, ff01::0, ff02::0, ff03::0, "
+                                  "ff04::0, ff05::0, ff06::0, ff07::0, "
+                                  "ff08::0, ff09::0, ff0a::0, ff0b::0, "
+                                  "ff0c::0, ff0d::0, ff0e::0, ff0f::0 "
+                               "}");
     expr_symtab_add_predicate(symtab, "ip6.mcast_all_nodes",
                               "ip6.dst == ff01::1 || ip6.dst == ff02::1");
     expr_symtab_add_predicate(symtab, "ip6.mcast_all_rtrs",
diff --git a/tests/ovn.at b/tests/ovn.at
index 0dabec8d9..18ce07e1a 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -41134,6 +41134,157 @@ OVN_CHECK_PACKETS([hv/vif1-tx.pcap], [expected-vif1])
 AT_CLEANUP
 ])
 
+dnl This test checks that the megaflows translated by ovs-vswitchd don't
+dnl have extensive matches on external IP addresses for simple routing.
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([IPv4/v6 routing to external - megaflow check for src/dst matches])
+AT_SKIP_IF([test $HAVE_SCAPY = no])
+ovn_start
+
+check ovn-nbctl ls-add sw0
+
+check ovn-nbctl lsp-add sw0 vm0
+check ovn-nbctl lsp-set-addresses vm0 "f0:00:0f:01:02:03 10.0.0.3 1000::3"
+
+check ovn-nbctl ls-add sw1
+
+check ovn-nbctl lsp-add sw1 ext
+check ovn-nbctl lsp-set-addresses ext unknown
+check ovn-nbctl lsp-set-type ext localnet
+check ovn-nbctl lsp-set-options ext network_name=phys
+
+check ovn-nbctl lr-add lr0
+
+check ovn-nbctl lrp-add lr0 lr0-sw0 fa:16:3e:00:00:01 10.0.0.250/24 1000::f0/64
+check ovn-nbctl lsp-add sw0 sw0-lr0
+check ovn-nbctl lsp-set-type sw0-lr0 router
+check ovn-nbctl lsp-set-addresses sw0-lr0 router
+check ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0
+
+check ovn-nbctl lrp-add lr0 lr0-sw1 fa:16:3e:00:00:02 20.0.0.250/24 2000::f0/64
+check ovn-nbctl lsp-add sw1 sw1-lr0
+check ovn-nbctl lsp-set-type sw1-lr0 router
+check ovn-nbctl lsp-set-addresses sw1-lr0 router
+check ovn-nbctl lsp-set-options sw1-lr0 router-port=lr0-sw1
+
+dnl Add default routes for the external gateway.
+check ovn-nbctl lr-route-add lr0 "0.0.0.0/0" 20.0.0.254 lr0-sw1
+check ovn-nbctl lr-route-add lr0 "::/0" 2000::fe lr0-sw1
+
+net_add n1
+sim_add hv
+as hv
+check ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.1
+check ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+check ovs-vsctl add-port br-int vif1 -- \
+    set Interface vif1 external-ids:iface-id=vm0 \
+    options:tx_pcap=hv/vif1-tx.pcap \
+    options:rxq_pcap=hv/vif1-rx.pcap \
+    ofport-request=1
+
+check ovn-nbctl --wait=sb sync
+wait_for_ports_up
+
+dnl Create MAC binding entries for the external gateway, so OVN doesn't need
+dnl to ARP/ND for it.
+lr0_dp=$(fetch_column Datapath_Binding _uuid external_ids:name=lr0)
+check_uuid ovn-sbctl create mac_binding datapath=$lr0_dp logical_port=lr0-sw1 \
+    ip=\"2000::fe\" mac=\"f0:00:0f:01:02:fe\"
+check_uuid ovn-sbctl create mac_binding datapath=$lr0_dp logical_port=lr0-sw1 \
+    ip=\"20.0.0.254\" mac=\"f0:00:0f:01:02:fe\"
+check ovn-nbctl --wait=hv sync
+
+AS_BOX([IPv6 - from external to vm0])
+packet=$(fmt_pkt "Ether(dst='fa:16:3e:00:00:02', src='f0:00:0f:01:02:fe')/ \
+                  IPv6(dst='1000::3', src='3000::4', hlim=64)/ \
+                  UDP(sport=53, dport=4369)")
+as hv
+ovs-appctl ofproto/trace br-phys in_port=br-phys_n1 $packet --names > ext_ip6_ofproto_trace.txt
+check ovs-appctl netdev-dummy/receive br-phys_n1 $packet
+
+AT_CAPTURE_FILE([ext_ip6_ofproto_trace.txt])
+
+dnl Make sure the datapath flow doesn't match on a full external address.
+AT_CHECK([grep Megaflow ext_ip6_ofproto_trace.txt], [0], [stdout])
+AT_CHECK([grep Megaflow ext_ip6_ofproto_trace.txt | grep -q '3000::4'], [1])
+
+dnl Make sure that the packet was received by vm0.  The L2 addresses and the
+dnl hop limit will be different since the packet was routed.
+packet=$(fmt_pkt "Ether(dst='f0:00:0f:01:02:03', src='fa:16:3e:00:00:01')/ \
+                  IPv6(dst='1000::3', src='3000::4', hlim=63)/ \
+                  UDP(sport=53, dport=4369)")
+echo $packet >> expected-vif1
+OVN_CHECK_PACKETS([hv/vif1-tx.pcap], [expected-vif1])
+
+AS_BOX([IPv6 - from vm0 to external])
+packet=$(fmt_pkt "Ether(dst='fa:16:3e:00:00:01', src='f0:00:0f:01:02:03')/ \
+                  IPv6(dst='3000::4', src='1000::3', hlim=64)/ \
+                  UDP(sport=53, dport=4369)")
+as hv
+ovs-appctl ofproto/trace br-int in_port=vif1 $packet --names > vm0_ip6_ofproto_trace.txt
+check ovs-appctl netdev-dummy/receive vif1 $packet
+
+AT_CAPTURE_FILE([vm0_ip6_ofproto_trace.txt])
+
+dnl Make sure the datapath flow doesn't match on a full external address.
+AT_CHECK([grep Megaflow vm0_ip6_ofproto_trace.txt], [0], [stdout])
+AT_CHECK([grep Megaflow vm0_ip6_ofproto_trace.txt | grep -q '3000::4'], [1])
+
+dnl Make sure that the packet was received externally.  The L2 addresses and
+dnl the hop limit will be different since the packet was routed.
+packet=$(fmt_pkt "Ether(dst='f0:00:0f:01:02:fe', src='fa:16:3e:00:00:02')/ \
+                  IPv6(dst='3000::4', src='1000::3', hlim=63)/ \
+                  UDP(sport=53, dport=4369)")
+echo $packet >> expected-ext
+OVN_CHECK_PACKETS([hv/br-phys_n1-tx.pcap], [expected-ext])
+
+AS_BOX([IPv4 - from external to vm0])
+packet=$(fmt_pkt "Ether(dst='fa:16:3e:00:00:02', src='f0:00:0f:01:02:fe')/ \
+                  IP(dst='10.0.0.3', src='30.0.0.4', ttl=64)/ \
+                  UDP(sport=53, dport=4369)")
+as hv
+ovs-appctl ofproto/trace br-phys in_port=br-phys_n1 $packet --names > ext_ip4_ofproto_trace.txt
+check ovs-appctl netdev-dummy/receive br-phys_n1 $packet
+
+AT_CAPTURE_FILE([ext_ip4_ofproto_trace.txt])
+
+dnl Make sure the datapath flow doesn't match on a full external address.
+AT_CHECK([grep Megaflow ext_ip4_ofproto_trace.txt], [0], [stdout])
+AT_CHECK([grep Megaflow ext_ip4_ofproto_trace.txt | grep -q '30.0.0.4'], [1])
+
+dnl Make sure that the packet was received by vm0.  The L2 addresses and the
+dnl TTL will be different since the packet was routed.
+packet=$(fmt_pkt "Ether(dst='f0:00:0f:01:02:03', src='fa:16:3e:00:00:01')/ \
+                  IP(dst='10.0.0.3', src='30.0.0.4', ttl=63)/ \
+                  UDP(sport=53, dport=4369)")
+echo $packet >> expected-vif1
+OVN_CHECK_PACKETS([hv/vif1-tx.pcap], [expected-vif1])
+
+AS_BOX([IPv4 - from vm0 to external])
+packet=$(fmt_pkt "Ether(dst='fa:16:3e:00:00:01', src='f0:00:0f:01:02:03')/ \
+                  IP(dst='30.0.0.4', src='10.0.0.3', ttl=64)/ \
+                  UDP(sport=53, dport=4369)")
+as hv
+ovs-appctl ofproto/trace br-int in_port=vif1 $packet --names > vm0_ip4_ofproto_trace.txt
+check ovs-appctl netdev-dummy/receive vif1 $packet
+
+AT_CAPTURE_FILE([vm0_ip4_ofproto_trace.txt])
+
+dnl Make sure the datapath flow doesn't match on a full external address.
+AT_CHECK([grep Megaflow vm0_ip4_ofproto_trace.txt], [0], [stdout])
+AT_CHECK([grep Megaflow vm0_ip4_ofproto_trace.txt | grep -q '30.0.0.4'], [1])
+
+dnl Make sure that the packet was received externally.  The L2 addresses and
+dnl the TTL will be different since the packet was routed.
+packet=$(fmt_pkt "Ether(dst='f0:00:0f:01:02:fe', src='fa:16:3e:00:00:02')/ \
+                  IP(dst='30.0.0.4', src='10.0.0.3', ttl=63)/ \
+                  UDP(sport=53, dport=4369)")
+echo $packet >> expected-ext
+OVN_CHECK_PACKETS([hv/br-phys_n1-tx.pcap], [expected-ext])
+
+AT_CLEANUP
+])
 
 OVN_FOR_EACH_NORTHD([
 AT_SETUP([Multichassis port I-P processing])
-- 
2.50.1
