Hi Martin, Frode,

I'd like to summarize OVN technical meeting discussion.

First, there was a discussion about new option naming. Ales proposed terms "forward" and "redirect" (IIRC). Do we want to reflect the exact behavior of new option? "redirect" from my perspective could be a good candidate.

Please correct me if I'm wrong in next 4 terms:

1. In current BGP support patch series [0] OVN only installs NAT and LB VIP addresses into a separate routing table via Netlink which is then should be redistributed by external routing daemon (FRR, Bird, etc.). External routing daemon is configured outside of OVN. 2. OVN doesn't import routes received and installed into separate kernel routing table by routing daemon. "OVN to outside" direction routes are configured as normal Logical_Router_Static_Route in OVN_Northbound by CMS/external automation, while "outside to OVN" direction routes are installed on the external router(s) automatically with BGP. 3. If user has more then one BGP peering with Leaf/external Router and needs fast (sub-second) fail over for "OVN to outside" direction, BFD should be configured for ECMP Logical_Router_Static_Routes from OVN side with these external router IPs as a nexthop group. On external router side BFD within BGP must be configured. 4. If to forward BFD traffic from LRP to "bgp daemon" LSP unconditionally, functionality from #3 will be broken.

I'm wondering, if we don't configure BFD for ECMP routes from OVNand use external tooling for routes learning, could we conditionally add BFD redirecting rules with a separate option? Would it be safe or there are any negative consequences?

0: https://patchwork.ozlabs.org/project/ovn/list/?series=416659

On 26.07.2024 23:10, Vladislav Odintsov wrote:
On 26.07.2024 21:21, martin.kal...@canonical.com wrote:
On Fri, 2024-07-26 at 21:07 +0700, Vladislav Odintsov wrote:
Hi Frode,

On 26.07.2024 19:17, Frode Nordahl wrote:
On Fri, Jul 26, 2024 at 11:28 AM Vladislav Odintsov
<odiv...@gmail.com> wrote:
Hi Martin, thanks for the patch.

Typically for faster BGP (or other routing protocols) convergence
BFD
signalling is used. Would you mind adding flows to forward BFD
traffic
to the same LSP as it is already done for BGP?

It could be an additional option like ("enable-bfd-for-bgp") or
something like this, or we can install flows unconditionally.
UDP/3789
is the default BFD proto/port.
BFD is indeed an important part for fast convergence of routing
protocols, it is however also an important part of end to end
liveness
detection for a data path.

In this work our goal is to exchange control plane information with
an
external daemon so that it can take care of the routing protocol
state
machine. We do want to keep the data path in OVS so that users can
benefit from all the data path implementations it has to offer,
including hardware acceleration.

With this in mind, does it really make sense for the external
daemon
to speak BFD, or would it be better to integrate with the OVN
managed
BFD for static routes which is implemented in the OVS data path?
Typically BFD for routing protocols is configured in routing daemons
(on
both sides of peering), because main routing daemon (e.g. bgpd) has
to
get notifications from BFD engine (e.g. bfdd), that the connection is
lost. OVS-based BFD sessions seems nothing to do here. I proposed to
install "redirect" flows similar to BGP: forward udp/3789 to
dedicated
LSP for routing daemon. After installing openflow control plane is
not
needed for BFD to work. OVS datapath in this case just forwards
traffic
from external network (for instance, leaf switch) to internal OVS
port
to routing daemon).

Hope this explains the idea more clear.
Wouldn't this be like having multiple "sources of truth" in the system?
On one hand there's OVN injecting routes [0], that are picked up by the
BGP daemon, and on the other there's a BFD daemon that will be removing
them if it believes that they are unreachable. Couldn't this lead to
some flapping?
It shouldn't. Just in case - I'm not talking about OVN BFD for static routes feature. I mean BFD within routing daemon. BFD daemon is just a "sidecar" for the BGP to notify the latter that the connectivity is lost. After BFD detects connectivity failure it notifies BGP and it terminates BGP session and removes routes learnt from dead peer. [0] This works for both sides: for routing daemon on the "OVN side" and for external BGP speaker. This can protect against 2 types of failures:

1. L3 Gateway failure: power outage, physical disconnection, kernel panic, OVS failure, etc. If ha-chassis-group is configured, other OVN cluster nodes will detect this failure though OVS-based BFD and trigger failover to the next ha-chassis in the group. At the same time external BGP speaker will also detect that BFD session went down and terminate BGP session with routing daemon and send BGP update with removal of routes through itself, because they are not reachable anymore. Then a next by priority l3gateway claims chassis-redirect LRP and start to advertise NAT/LB VIP addresses.

2. Leaf/external router failure. In case, where we have two or more peers/leafs, these leafs can go up and down. FRR should install/delete routes though each of them as fast as possible. Here BFD also comes to help to detect failures faster than BGP keepalives do. IIUC, in current BGP integration OVN doesn't import routes learned from BGP and installed by FRR into VRF. But this should be done in future so that LR could send traffic only to alive peer.

For the clarity, I've prepared a small illustration of interacting components in drawio [1] and PNG [2] formats. There are two pairs curve arrows between FRR running in a VRF and 2 external BGP speakers. The light blue and orange arrows show BGP session traffic datapaths and the dash and dash red and blue lines show BFD traffic datapaths.

Please correct me if I misunderstood your point.

0: https://docs.frrouting.org/en/latest/bfd.html#bgp-bfd-configuration

1: https://s3.k2.cloud/vlodintsov/public-artifacts/ovn-native-bgp.drawio

2: https://s3.k2.cloud/vlodintsov/public-artifacts/ovn-native-bgp.drawio.png
[0]
https://mail.openvswitch.org/pipermail/ovs-dev/2024-July/416038.html

Also, do we need to hard code on the BGP protocol or it can be
generalized so that an end user can pass a proto and optionally a
port
to forward? This can bring OSPF or other dynamic routing
protocols support.

What do you think?
While it is true that this may apply generally to other routing
protocols, it does make it more complicated to configure.
Well, I thought about it more and agree with you. To implement OSPF
there must be more work done like handling multicast traffic for a
specific IPv4 or IPv6 address. This could be done in a separate patch
when OSPF support is needed.
--
Frode Nordahl

On 25.07.2024 02:02, Mark Michelson wrote:
Hi Martin, thanks for the patch.

I have one note below, but other than that it looks good to me.

On 7/16/24 02:59, Martin Kalcok wrote:
This change adds a 'bgp-mirror' option to LRP that allows
mirroring of BGP control plane traffic to an arbitrary LSP
in its peer LS.

The option expects a string with a LSP name. When set,
any traffic entering LS that's destined for any of the
LRP's IP addresses (including IPv6 LLA) is redirected
to the LSP specified in the option's value.

This enables external BGP daemons to listen on an interface
bound to a LSP and effectively act as if they were listening
on (and speaking from) LRP's IP address.

Signed-off-by: Martin Kalcok <martin.kal...@canonical.com>
---
    northd/northd.c         | 87
+++++++++++++++++++++++++++++++++++++++++
    northd/ovn-northd.8.xml | 23 +++++++++++
    ovn-nb.xml              | 14 +++++++
    tests/ovn-northd.at     | 45 +++++++++++++++++++++
    tests/system-ovn.at     | 86
++++++++++++++++++++++++++++++++++++++++
    5 files changed, 255 insertions(+)

diff --git a/northd/northd.c b/northd/northd.c
index 4353df07d..e07bf68cc 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -13048,6 +13048,92 @@
build_arp_resolve_flows_for_lrp(struct
ovn_port *op,
        }
    }
    +static void
+build_bgp_redirect_rule__(
+        const char *s_addr, const char *bgp_port_name, bool
is_ipv6,
+        struct ovn_port *ls_peer, struct lflow_table
*lflows,
+        struct ds *match, struct ds *actions)
+{
+    int ip_ver = is_ipv6 ? 6 : 4;
+    /* Redirect packets in the input pipeline destined for
LR's IP to
+     * the port specified in 'bgp-mirror' option.
+     */
+    ds_clear(match);
+    ds_clear(actions);
+    ds_put_format(match, "ip%d.dst == %s && tcp.dst == 179",
ip_ver,
s_addr);
+    ds_put_format(actions, "outport = \"%s\"; output;",
bgp_port_name);
+    ovn_lflow_add(lflows, ls_peer->od, S_SWITCH_IN_L2_LKUP,
100,
+                  ds_cstr(match),
+                  ds_cstr(actions),
+                  ls_peer->lflow_ref);
+
+
+    /* Drop any traffic originating from 'bgp-mirror' port
that does
+     * not originate from BGP daemon port. This blocks
unnecessary
+     * traffic like ARP broadcasts or IPv6 router
solicitation packets
+     * from the dummy 'bgp-mirror' port.
+     */
+    ds_clear(match);
+    ds_put_format(match, "inport == \"%s\"", bgp_port_name);
+    ovn_lflow_add(lflows, ls_peer->od,
S_SWITCH_IN_CHECK_PORT_SEC, 80,
+                  ds_cstr(match),
+                  REGBIT_PORT_SEC_DROP " = 1; next;",
+                  ls_peer->lflow_ref);
+
+    ds_put_format(match,
+                  " && ip%d.src == %s && tcp.src == 179",
+                  ip_ver,
+                  s_addr);
+    ovn_lflow_add(lflows, ls_peer->od,
S_SWITCH_IN_CHECK_PORT_SEC, 81,
+                  ds_cstr(match),
+                  REGBIT_PORT_SEC_DROP " =
check_in_port_sec(); next;",
+                  ls_peer->lflow_ref);
+}
+
+static void
+build_lrouter_bgp_redirect(
+        struct ovn_port *op, struct lflow_table *lflows,
+        struct ds *match, struct ds *actions)
+{
+    /* LRP has to have a peer.*/
+    if (op->peer == NULL) {
+        return;
+    }
+    /* LRP has to have NB record.*/
+    if (op->nbrp == NULL) {
+        return;
+    }
+
+    /* Proceed only for LRPs that have 'bgp-mirror' option
set.
Value of this
+     * option is the name of LSP to which a BGP traffic will
be
mirrored.
+     */
+    const char *bgp_port = smap_get(&op->nbrp->options,
"bgp-mirror");
+    if (bgp_port == NULL) {
+        return;
+    }
+
+    if (op->cr_port != NULL) {
+        static struct vlog_rate_limit rl =
VLOG_RATE_LIMIT_INIT(1, 5);
+        VLOG_WARN_RL(&rl, "Option 'bgp-mirror' is not
supported on"
+                          " Distributed Gateway Port '%s'",
op->key);
+        return;
+    }
Somewhere around here would be a good place to ensure that
"bgp_port"
exists on op->peer. If the port does not exist, then print a
warning
and return early.

It would also be a good idea to add this as part of the ovn-
northd
test. Set the router port's bgp_port to a nonexistent port on
the
connected logical switch and ensure that BGP-related logical
flows are
not installed.

+
+    /* Mirror traffic destined for LRP's IPs and default BGP
port
+     * to the port defined in 'bgp-mirror' option.
+     */
+    for (size_t i = 0; i < op->lrp_networks.n_ipv4_addrs;
i++) {
+        const char *ip_s = op-
lrp_networks.ipv4_addrs[i].addr_s;
+        build_bgp_redirect_rule__(ip_s, bgp_port, false, op-
peer,
lflows,
+                                  match, actions);
+    }
+    for (size_t i = 0; i < op->lrp_networks.n_ipv6_addrs;
i++) {
+        const char *ip_s = op-
lrp_networks.ipv6_addrs[i].addr_s;
+        build_bgp_redirect_rule__(ip_s, bgp_port, true, op-
peer,
lflows,
+                                  match, actions);
+    }
+}
+
    /* This function adds ARP resolve flows related to a LSP.
*/
    static void
    build_arp_resolve_flows_for_lsp(
@@ -16003,6 +16089,7 @@
build_lswitch_and_lrouter_iterate_by_lrp(struct ovn_port *op,
lsi->meter_groups, op-
lflow_ref);
build_lrouter_icmp_packet_toobig_admin_flows(op, lsi-
lflows,
&lsi->match,
&lsi->actions, op->lflow_ref);
+    build_lrouter_bgp_redirect(op, lsi->lflows, &lsi->match,
&lsi->actions);
    }
      static void *
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-
northd.8.xml
index b06b09ac5..1bf9d2dc0 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -284,6 +284,21 @@
            dropped in the next stage.
          </li>
    +      <li>
+        For each logical port that's defined as a target of
BGP
mirroring (via
+        <code>bgp-mirror</code> option set on Logical Router
Port),
a filter is
+        set in place that disallows any traffic entering
this port
that does
+        not originate from Logical Router Port's IPs and
default BGP
port (
+        TCP 179). This filtering is achieved by two rules.
First
rule has
+        priority 81, it matches on <code>inport ==
<var>BGP_MIRROR_PORT</var>
+        &amp;&amp; ip.src == <var>LRP_IP</var> &amp;&amp;
tcp.src ==
179</code>
+        and allows traffic further with
<code>REGBIT_PORT_SEC_DROP" =
+        check_in_port_sec(); next;</code>. Second rule with
priority
80 matches
+        the rest of the traffic from that port and sets
+        <code>REGBIT_PORT_SEC_DROP" = 1; next;"</code> so
that the
packets are
+        dropped in the next stage.
+      </li>
+
          <li>
            For each (enabled) vtep logical port, a priority
70 flow is
added which
            matches on all packets and applies the action
@@ -1949,6 +1964,14 @@ output;
            on the logical switch.
          </li>
    +      <li>
+        For any logical port that's defined as a target of
BGP
mirroring (via
+        <code>bgp-mirror</code> option set on Logical Router
Port), a
+        priority-100 flow is added that redirects traffic
for
Logical Router
+        Port IPs (including IPv6 LLA) and TCP port 179, to
the targeted
+        logical port.
+      </li>
+
          <li>
            Priority-90 flows for transit switches that
forward registered
            IP multicast traffic to their corresponding
multicast group
, which
diff --git a/ovn-nb.xml b/ovn-nb.xml
index 9552534f6..44d3591db 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -3440,6 +3440,20 @@ or
            </p>
          </column>
    +      <column name="options" key="bgp-mirror"
type='{"type":
"string"}'>
+        <p>
+          This option expects a name of a Logical Switch
Port that's
present
+          in the peer's Logical Switch. If set, it causes
any traffic
+          that's destined for Logical Router Port's IP
addresses
(including
+          its IPv6 LLA) and the default BGP port (TCP 179),
to be
mirrored
+          to the specified Logical Switch Port.
+
+          This allows external BGP daemon to be bound to a
port in
OVN's
+          Logical Switch and act as if it was listening on
Logical
Router
+          Port's IP address.
+        </p>
+      </column>
+
          <column name="options" key="gateway_mtu_bypass">
            <p>
              When configured, represents a match expression,
in the same
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index a389d1988..0c58066f6 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -12721,3 +12721,48 @@ AT_CHECK([ovn-sbctl dump-flows lr |
grep
lr_in_dnat | ovn_strip_lflows], [0], [d
      AT_CLEANUP
    ])
+
+OVN_FOR_EACH_NORTHD_NO_HV([
+AT_SETUP([BGP control plane mirroring])
+ovn_start
+
+check ovn-sbctl chassis-add hv1 geneve 127.0.0.1
+
+check ovn-nbctl lr-add lr -- \
+    lrp-add lr lr-ls 02:ac:10:01:00:01 172.16.1.1/24
+check ovn-nbctl --wait=sb set logical_router lr
options:chassis=hv1
+
+check ovn-nbctl ls-add ls -- \
+    lsp-add ls ls-lr -- \
+    lsp-set-type ls-lr router -- \
+    lsp-set-addresses ls-lr router -- \
+    lsp-set-options ls-lr router-port=lr-ls
+
+check ovn-nbctl lsp-add ls lsp-bgp -- \
+    lsp-set-addresses lsp-bgp unknown
+
+# By default, no rules related to BGP mirroring are present
+AT_CHECK([ovn-sbctl dump-flows ls | grep ls_in_l2_lkup  |
grep
"tcp.dst == 179" | wc -l], [0], [0
+])
+
+AT_CHECK([ovn-sbctl dump-flows ls | grep
ls_in_check_port_sec | grep
-E "priority=80|priority=81" | wc -l], [0], [0
+])
+
+# Set "lsp-bgp" port as target of BGP control plane mirrored
traffic
+check ovn-nbctl --wait=sb set logical_router_port lr-ls
options:bgp-mirror=lsp-bgp
+
+# Check that BGP control plane traffic is redirected to
"lsp-bgp"
+AT_CHECK([ovn-sbctl dump-flows ls | grep ls_in_l2_lkup |
grep
"tcp.dst == 179" | ovn_strip_lflows], [0], [dnl
+  table=??(ls_in_l2_lkup      ), priority=100  ,
match=(ip4.dst ==
172.16.1.1 && tcp.dst == 179), action=(outport = "lsp-bgp";
output;)
+  table=??(ls_in_l2_lkup      ), priority=100  ,
match=(ip6.dst ==
fe80::ac:10ff:fe01:1 && tcp.dst == 179), action=(outport =
"lsp-bgp";
output;)
+])
+
+# Check that only BGP-related traffic is accepted on "lsp-
bgp" port
+AT_CHECK([ovn-sbctl dump-flows ls | grep
ls_in_check_port_sec | grep
-E "priority=80|priority=81" | ovn_strip_lflows], [0], [dnl
+  table=??(ls_in_check_port_sec), priority=80   ,
match=(inport ==
"lsp-bgp"), action=(reg0[[15]] = 1; next;)
+  table=??(ls_in_check_port_sec), priority=81   ,
match=(inport ==
"lsp-bgp" && ip4.src == 172.16.1.1 && tcp.src == 179),
action=(reg0[[15]] = check_in_port_sec(); next;)
+  table=??(ls_in_check_port_sec), priority=81   ,
match=(inport ==
"lsp-bgp" && ip6.src == fe80::ac:10ff:fe01:1 && tcp.src ==
179),
action=(reg0[[15]] = check_in_port_sec(); next;)
+])
+
+AT_CLEANUP
+])
diff --git a/tests/system-ovn.at b/tests/system-ovn.at
index c24ede7c5..fbcb05e59 100644
--- a/tests/system-ovn.at
+++ b/tests/system-ovn.at
@@ -13027,3 +13027,89 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/failed
to query
port patch-.*/d
    /connection dropped.*/d"])
    AT_CLEANUP
    ])
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([BGP control plane mirroring])
+ovn_start
+OVS_TRAFFIC_VSWITCHD_START()
+
+ADD_BR([br-int])
+ADD_BR([br-ext])
+
+ovs-ofctl add-flow br-ext action=normal
+# Set external-ids in br-int needed for ovn-controller
+ovs-vsctl \
+        -- set Open_vSwitch . external-ids:system-id=hv1 \
+        -- set Open_vSwitch .
external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
+        -- set Open_vSwitch . external-ids:ovn-encap-
type=geneve \
+        -- set Open_vSwitch . external-ids:ovn-encap-
ip=169.0.0.1 \
+        -- set bridge br-int fail-mode=secure
other-config:disable-in-band=true
+
+# Start ovn-controller
+start_daemon ovn-controller
+
+ovn-nbctl lr-add R1 \
+    -- set Logical_Router R1 options:chassis=hv1
+
+ovn-nbctl ls-add public
+
+ovn-nbctl lrp-add R1 rp-public 00:00:02:01:02:03
172.16.1.1/24
+
+ovn-nbctl lsp-add public public-rp -- set
Logical_Switch_Port
public-rp \
+    type=router options:router-port=rp-public \
+    -- lsp-set-addresses public-rp router
+
+ovn-nbctl lsp-add public bgp-daemon \
+    -- lsp-set-addresses bgp-daemon unknown
+
+AT_CHECK([ovs-vsctl set Open_vSwitch .
external-ids:ovn-bridge-mappings=phynet:br-ext])
+ovn-nbctl lsp-add public public1 \
+        -- lsp-set-addresses public1 unknown \
+        -- lsp-set-type public1 localnet \
+        -- lsp-set-options public1 network_name=phynet
+
+ovn-nbctl --wait=hv sync
+
+# Set option that redirects BGP control plane traffic to a
LSP
"bgp-daemon"
+ovn-nbctl --wait=sb set logical_router_port rp-public
options:bgp-mirror=bgp-daemon
+
+# Create "bgp-daemon" interface in a namespace with IP and
MAC
matching LRP "rp-public"
+ADD_NAMESPACES(bgp-daemon)
+ADD_VETH(bgp-daemon, bgp-daemon, br-int, "172.16.1.1/24",
"00:00:02:01:02:03")
+
+ADD_NAMESPACES(ext-foo)
+ADD_VETH(ext-foo, ext-foo, br-ext, "172.16.1.100/24",
"00:10:10:01:02:13", \
+         "172.16.1.1")
+
+# Flip the interface down/up to get proper IPv6 LLA
+NS_EXEC([bgp-daemon], [ip link set down bgp-daemon])
+NS_EXEC([bgp-daemon], [ip link set up bgp-daemon])
+
+# Wait until IPv6 LLA loses the "tentative" flag otherwise
it can't
be bound to.
+OVS_WAIT_UNTIL([NS_EXEC([bgp-daemon], [ip a show dev bgp-
daemon |
grep "fe80::" | grep -v tentative])])
+OVS_WAIT_UNTIL([NS_EXEC([ext-foo], [ip a show dev ext-foo |
grep
"fe80::" | grep -v tentative])])
+
+# Verify that BGP control plane traffic is delivered to the
"bgp-daemon"
+# interface on both IPv4 and IPv6 LLA addresses
+NETNS_DAEMONIZE([bgp-daemon], [nc -l -k 172.16.1.1 179],
[bgp_v4.pid])
+NS_CHECK_EXEC([ext-foo], [echo "TCP test" | nc --send-only
172.16.1.1 179])
+
+NETNS_DAEMONIZE([bgp-daemon], [nc -l -6 -k
fe80::200:2ff:fe01:203%bgp-daemon 179], [bgp_v6.pid])
+NS_CHECK_EXEC([ext-foo], [echo "TCP test" | nc --send-only -
6
fe80::200:2ff:fe01:203%ext-foo 179])
+
+OVS_APP_EXIT_AND_WAIT([ovn-controller])
+
+as ovn-sb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as ovn-nb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as northd
+OVS_APP_EXIT_AND_WAIT([ovn-northd])
+
+as
+OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
+/.*terminating with signal 15.*/d"])
+AT_CLEANUP
+])
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to