By default OVN "fails open" for load balancer health checks: a backend
whose service monitor status is not yet known (empty) is treated as
available, so traffic is forwarded to it while the first health check
probes are still pending. This avoids disrupting traffic when health
checks are first enabled, but it also means traffic can be sent to
backends that have never been successfully probed during the initial
monitoring grace period.

Add a new "forward_unknown" key to the options column of the
Load_Balancer_Health_Check table. When set to "false", a backend with
an unknown (empty) service monitor status is treated as unavailable and
excluded from the load balancer backends until its first successful
probe. The default ("true") preserves the existing fail-open behavior,
so this change is fully backward compatible and opt-in.

No schema change is required as this reuses the existing options map.

Signed-off-by: Matteo Perin <[email protected]>
---
This patch is a proposal for a simple addition to the LB health check
configuration.

Currently, the OVN load balancer health checks fail open by default.  Until
a backend has been probed at least once, its service monitor status is empty
and the backend is treated as available.
This is a sensible default in steady state, but it can create a window of
unreliability whenever a backend is newly added to a load balancer.

Consider, for example, a CMS that manages pools of backends and lets users
attach and detach workloads dynamically.  When a new backend is added to a
pool, the CMS could add it into the OVN load balancer immediately, but the
workload behind that backend may not yet be listening on the target port.
With the current behavior, OVN will forward a share of incoming traffic to
this "unready" backend until the first few health check probes fail and mark
it offline. Connections load-balanced to it during that window simply fail.

How large this window is depends entirely on the health check timing. With
aggressive settings (e.g. interval=1, failure_count=1) the backend is marked
offline almost immediately and the problem is barely noticeable. But users
often deliberately relax these values to avoid flapping and reduce probe
overhead (e.g. interval=10, failure_count=3), which pushes the time-to-
offline to tens of seconds, which is a long time to be black-holing a
fraction of connections every time a backend is added.

The new forward_unknown=false option lets the developer opt into a
"fail-closed" behavior. A freshly added backend is excluded from the load
balancer until it has been positively verified by a successful health
check probe, eliminating the initial grace-period traffic loss.
The default remains true (fail-open) so existing deployments are unaffected.

Best Regards,
Matteo Perin

 Documentation/ref/ovn-logical-flows.7.rst |  6 +-
 NEWS                                      |  5 ++
 northd/northd.c                           | 19 +++++-
 ovn-nb.xml                                | 18 ++++++
 tests/ovn-northd.at                       | 70 +++++++++++++++++++++++
 5 files changed, 114 insertions(+), 4 deletions(-)

diff --git a/Documentation/ref/ovn-logical-flows.7.rst 
b/Documentation/ref/ovn-logical-flows.7.rst
index ce4dd5355..75aba1283 100644
--- a/Documentation/ref/ovn-logical-flows.7.rst
+++ b/Documentation/ref/ovn-logical-flows.7.rst
@@ -612,6 +612,8 @@ Ingress Table 15: LB
   addresses of *args* is the same as the address family of *VIP*. If health
   check is enabled, then *args* will only contain those endpoints whose service
   monitor status entry in ``OVN_Southbound`` db is either ``online`` or empty.
+  Endpoints with an empty (unknown) status are excluded when the health check
+  ``forward_unknown`` option is set to ``false``.
   For IPv4 traffic the flow also loads the original destination IP and 
transport
   port in registers ``reg1`` and ``reg2``.  For IPv6 traffic the flow also 
loads
   the original destination IP and transport port in registers ``xxreg1`` and
@@ -2730,7 +2732,9 @@ flows do not get programmed for load balancers with IPv6 
*VIPs*.
   will be replaced by ``flags.skip_snat_for_lb = 1; ct_lb_mark(args;
   skip_snat);``. If health check is enabled, then *args* will only contain 
those
   endpoints whose service monitor status entry in ``OVN_Southbound`` db is
-  either ``online`` or empty.
+  either ``online`` or empty.  Endpoints with an empty (unknown) status are
+  excluded when the health check ``forward_unknown`` option is set to
+  ``false``.
 
 - For all the configured load balancing rules for a router in 
``OVN_Northbound``
   database that includes just an IP address *VIP* to match on, a priority-110
diff --git a/NEWS b/NEWS
index 748ae30eb..610af58fd 100644
--- a/NEWS
+++ b/NEWS
@@ -33,6 +33,11 @@ Post v26.03.0
      The DHCP and unbound-router ARP/ND drop lflows for external
      ports were updated to key on the external LSP's inport
      accordingly.
+   - Added "forward_unknown" option to Load_Balancer_Health_Check.  When set to
+     "false", backends whose service monitor status is still unknown (empty)
+     are treated as unavailable and excluded until their first successful
+     health check probe.  The default ("true") preserves the previous
+     fail-open behavior.
 
 OVN v26.03.0 - xxx xx xxxx
 --------------------------
diff --git a/northd/northd.c b/northd/northd.c
index 0dbf17426..e27b1f14b 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -3451,6 +3451,7 @@ ovn_lb_svc_create(struct ovsdb_idl_txn *ovnsb_txn,
 
 static bool
 backend_is_available(const struct ovn_northd_lb *lb,
+                     const struct ovn_northd_lb_vip *lb_vip_nb,
                      const struct ovn_lb_backend *backend,
                      const struct ovn_northd_lb_backend *backend_nb,
                      const struct svc_monitors_map_data *svc_mons_data)
@@ -3474,9 +3475,20 @@ backend_is_available(const struct ovn_northd_lb *lb,
 
     ovs_assert(mon_info->sbrec_mon);
 
-    return (mon_info->sbrec_mon->status &&
-           strcmp(mon_info->sbrec_mon->status, "online")) ?
-           false : true;
+    const char *status = mon_info->sbrec_mon->status;
+
+    /* By default OVN "fails open": a backend whose service monitor status is
+     * not yet known (empty) is treated as available so that traffic is not
+     * disrupted while the first health check probes are still pending.  When
+     * the "forward_unknown" option is disabled, a backend with an unknown
+     * (empty) status is instead treated as unavailable until its first
+     * successful probe. */
+    if (!status || !status[0]) {
+        return smap_get_bool(&lb_vip_nb->lb_health_check->options,
+                             "forward_unknown", true);
+    }
+
+    return !strcmp(status, "online");
 }
 
 static bool
@@ -3531,6 +3543,7 @@ build_lb_vip_actions(const struct ovn_northd_lb *lb,
 
             if (backend_nb->health_check &&
                 !backend_is_available(lb,
+                                      lb_vip_nb,
                                       backend,
                                       backend_nb,
                                       svc_mons_data)) {
diff --git a/ovn-nb.xml b/ovn-nb.xml
index 15fb1d7e8..3bb027d44 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -2858,6 +2858,24 @@ or
         The number of failure checks after which the endpoint is considered
         offline.
       </column>
+
+      <column name="options" key="forward_unknown"
+              type='{"type": "boolean"}'>
+        <p>
+          By default (<code>true</code>) a backend whose health is not yet
+          known is treated as available, so that traffic is forwarded to it
+          while the first health check probes are still pending.  This avoids
+          disrupting traffic when health checks are first enabled.
+        </p>
+
+        <p>
+          When set to <code>false</code>, a backend whose service monitor
+          status is still unknown (empty) is treated as unavailable and is
+          excluded from the load balancer backends until its first successful
+          health check probe.  This prevents traffic from being forwarded to
+          unverified backends during the initial monitoring grace period.
+        </p>
+      </column>
     </group>
 
     <group title="Common Columns">
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index f87b14c9a..abd1bc3e1 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -1798,6 +1798,76 @@ OVN_CLEANUP_NORTHD
 AT_CLEANUP
 ])
 
+OVN_FOR_EACH_NORTHD_NO_HV([
+AT_SETUP([load balancer health check forward_unknown option])
+ovn_start
+
+check ovn-nbctl lb-add lb1 10.0.0.10:80 10.0.0.3:80,20.0.0.3:80
+check ovn-nbctl --wait=sb set load_balancer . 
ip_port_mappings:10.0.0.3=sw0-p1:10.0.0.2
+check ovn-nbctl --wait=sb set load_balancer . 
ip_port_mappings:20.0.0.3=sw1-p1:20.0.0.2
+
+check ovn-nbctl ls-add sw0
+check ovn-nbctl --wait=sb lsp-add sw0 sw0-p1 -- lsp-set-addresses sw0-p1 \
+"00:00:00:00:00:03 10.0.0.3"
+check ovn-nbctl ls-add sw1
+check ovn-nbctl --wait=sb lsp-add sw1 sw1-p1 -- lsp-set-addresses sw1-p1 \
+"02:00:00:00:00:03 20.0.0.3"
+
+check ovn-sbctl chassis-add hv1 geneve 127.0.0.1
+check ovn-sbctl lsp-bind sw0-p1 hv1
+check ovn-sbctl lsp-bind sw1-p1 hv1
+wait_row_count nb:Logical_Switch_Port 1 name=sw0-p1 'up=true'
+wait_row_count nb:Logical_Switch_Port 1 name=sw1-p1 'up=true'
+
+check ovn-nbctl --wait=sb ls-lb-add sw0 lb1
+check_uuid ovn-nbctl --wait=sb -- --id=@hc create \
+Load_Balancer_Health_Check vip="10.0.0.10\:80" -- add Load_Balancer . \
+health_check @hc
+wait_row_count Service_Monitor 2
+
+AS_BOX([Default (fail-open): empty status backends are forwarded to.])
+AT_CAPTURE_FILE([sbflows_default])
+OVS_WAIT_FOR_OUTPUT(
+  [ovn-sbctl dump-flows sw0 | tee sbflows_default | grep 
'priority=120.*backends' | ovn_strip_lflows], 0, [dnl
+  table=??(ls_in_lb           ), priority=120  , match=(ct.new && ip4.dst == 
10.0.0.10 && reg1[[16..23]] == 6 && reg1[[0..15]] == 80), action=(reg4 = 
10.0.0.10; reg2[[0..15]] = 80; ct_lb_mark(backends=10.0.0.3:80,20.0.0.3:80);)
+])
+
+AS_BOX([Set forward_unknown=false: empty status backends are excluded.])
+check ovn-nbctl --wait=sb set Load_Balancer_Health_Check . \
+options:forward_unknown=false
+
+AT_CAPTURE_FILE([sbflows_unknown])
+OVS_WAIT_FOR_OUTPUT(
+  [ovn-sbctl dump-flows sw0 | tee sbflows_unknown | grep "ip4.dst == 
10.0.0.10.*reg1.*== 6.*reg1.*== 80" | grep priority=120 | grep ls_in_lb | 
ovn_strip_lflows], [0], [dnl
+  table=??(ls_in_lb           ), priority=120  , match=(ct.new && ip4.dst == 
10.0.0.10 && reg1[[16..23]] == 6 && reg1[[0..15]] == 80), action=(drop;)
+])
+
+AS_BOX([An online backend is forwarded to even with forward_unknown=false.])
+sm_sw0_p1=$(fetch_column Service_Monitor _uuid logical_port=sw0-p1)
+check ovn-sbctl set service_monitor $sm_sw0_p1 status=online
+wait_row_count Service_Monitor 1 logical_port=sw0-p1 status=online
+check ovn-nbctl --wait=sb sync
+
+AT_CAPTURE_FILE([sbflows_online])
+OVS_WAIT_FOR_OUTPUT(
+  [ovn-sbctl dump-flows sw0 | tee sbflows_online | grep 
'priority=120.*backends' | ovn_strip_lflows], 0, [dnl
+  table=??(ls_in_lb           ), priority=120  , match=(ct.new && ip4.dst == 
10.0.0.10 && reg1[[16..23]] == 6 && reg1[[0..15]] == 80), action=(reg4 = 
10.0.0.10; reg2[[0..15]] = 80; ct_lb_mark(backends=10.0.0.3:80);)
+])
+
+AS_BOX([Set forward_unknown=true: empty status backends are forwarded again.])
+check ovn-nbctl --wait=sb set Load_Balancer_Health_Check . \
+options:forward_unknown=true
+
+AT_CAPTURE_FILE([sbflows_restore])
+OVS_WAIT_FOR_OUTPUT(
+  [ovn-sbctl dump-flows sw0 | tee sbflows_restore | grep 
'priority=120.*backends' | ovn_strip_lflows], 0, [dnl
+  table=??(ls_in_lb           ), priority=120  , match=(ct.new && ip4.dst == 
10.0.0.10 && reg1[[16..23]] == 6 && reg1[[0..15]] == 80), action=(reg4 = 
10.0.0.10; reg2[[0..15]] = 80; ct_lb_mark(backends=10.0.0.3:80,20.0.0.3:80);)
+])
+
+OVN_CLEANUP_NORTHD
+AT_CLEANUP
+])
+
 OVN_FOR_EACH_NORTHD_NO_HV([
 AT_SETUP([Load balancer VIP in NAT entries])
 AT_SKIP_IF([test $HAVE_PYTHON = no])
-- 
2.43.0

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to