Hi Ales,

I have some high-level comments about the change.

I did some research into how put_vport_bindings work. This commit explains the process very well: https://github.com/ovn-org/ovn/commit/3bc6e55a891b60f2abead99a063b7cccf2988907

In short, every bind_vport() action results in the virtual port being bound to the virtual parent twice. This is to cope with possible transaction errors that can happen when binding the virtual port.

With this new change, the first time that run_put_vport_binding() is called, we'll set a backoff of 1000 ms. Then pinctrl will wake up immediately, since the put_vport_binding is still in the hmap. Then we'll call run_put_vport_binding() again. After this second time, the put_vport_binding is removed from the hmap.

On a small system, this will mean that the second call to run_put_vport_binding() will happen before the backoff has expired, meaning the second run of run_put_vport_binding() is a waste.

On a loaded system, where ovn-controller loops regularly take multiple seconds, then this second call to run_put_vport_binding() will always happen after the backoff has expired, therefore, setting the backoff timer was not really worth it on the first run.

In other words, I think that the two approaches are at odds with each other. On one hand, we want to attempt to bind the port twice in quick succession as a way of beating the odds of encountering SB transaction error. On the other, we want to ensure that virtual port claims are spread out enough so as not to cause excessive churn.

I have a couple of suggestions:
1) Always check if the backoff timer has expired before attempting to claim the port. However, only reset the backoff timer if vpb->new_record is false. This way, we still attempt to bind the port twice in quick succession. But then further calls to bind_vport() for the virtual port will be backed off properly.

2) Get rid of the behavior where we attempt to bind the virtual port twice every time. Since you've added a new "vport-last-claimed" timestamp to Port_Binding, it means we can reliably tell when the last time that a virtual port was successfully claimed. Before claiming the virtual port, we can store in memory the current vport-last-claimed timestamp. Then we can attempt to claim the port. We'll keep the put_vport_binding in our hmap. Then on the next loop, if the SB vport-last-claimed value is still the same as our stored vport-last-claimed timestamp, we know our attempt failed and we should try again immediately, once again keeping the put_vport_binding in our hmap. If the SB vport-last-claimed value is different, we know that an attempt to claim the port was successful (it may be us or a different chassis). We should engage the backoff timer and remove the put_vport_binding from our hmap.

(1) is the easier choice, but I think (2) is the more correct choice.

What do you think?

Aside from that, I have a small comment below.

On 3/25/25 10:17, Ales Musil via dev wrote:
Limit the number of claim for virtual ports to 1/s with backoff
period of 10 seconds. This should help with cluster stabilization
during fail-overs. It can happen that during fail-over there will be
a lot of attempts to claim the virtual ports. With the backoff we can
ensure that the cluster had enough time to try to stabilize the
fail-overs.

Reported-at: https://issues.redhat.com/browse/FDP-443
Signed-off-by: Ales Musil <[email protected]>
---
  controller/pinctrl.c |  59 +++++++++++++++++++----
  northd/northd.c      |  15 ++++++
  tests/ovn.at         | 108 ++++++++++++++++++++++++++++++++++++++++++-
  3 files changed, 172 insertions(+), 10 deletions(-)

diff --git a/controller/pinctrl.c b/controller/pinctrl.c
index 1481b2547..2d8536c4a 100644
--- a/controller/pinctrl.c
+++ b/controller/pinctrl.c
@@ -7365,6 +7365,9 @@ pinctrl_find_put_vport_binding(uint32_t dp_key, uint32_t 
vport_key,
      return NULL;
  }
+#define VPORT_BACKOFF_INTERVAL 1000ll
+#define VPORT_MAX_POSTPONE 10000ll
+
  static void
  run_put_vport_binding(struct ovsdb_idl_txn *ovnsb_idl_txn OVS_UNUSED,
                        struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
@@ -7385,18 +7388,56 @@ run_put_vport_binding(struct ovsdb_idl_txn 
*ovnsb_idl_txn OVS_UNUSED,
      }
/* pinctrl module updates the port binding only for type 'virtual'. */
-    if (!strcmp(pb->type, "virtual")) {
-        const struct sbrec_port_binding *parent = lport_lookup_by_key(
+    if (strcmp(pb->type, "virtual")) {
+        return;
+    }
+
+    const struct sbrec_port_binding *parent = lport_lookup_by_key(
          sbrec_datapath_binding_by_key, sbrec_port_binding_by_key,
          vpb->dp_key, vpb->vport_parent_key);
-        if (parent) {
-            VLOG_INFO("Claiming virtual lport %s for this chassis "
-                       "with the virtual parent %s",
-                       pb->logical_port, parent->logical_port);
-            sbrec_port_binding_set_chassis(pb, chassis);
-            sbrec_port_binding_set_virtual_parent(pb, parent->logical_port);
-        }
+    if (!parent) {
+        return;
      }
+
+    if (pb->chassis == chassis && !strcmp(pb->virtual_parent,
+                                          parent->logical_port)) {
+        VLOG_DBG("Virtual port %s is already claimed by this chassis.",
+                 pb->logical_port);
+        return;
+    }
+
+    long long timewall_now = time_wall_msec();
+    long long last_claim = ovn_smap_get_llong(&pb->options,
+                                              "vport-last-claim", 0);
+    long long backoff = ovn_smap_get_llong(&pb->options, "vport-backoff", 0);
+
+    struct smap options;
+    smap_clone(&options, &pb->options);
+
+    long long next_backoff;
+    if (timewall_now >= last_claim + backoff) {
+        sbrec_port_binding_set_chassis(pb, chassis);
+        sbrec_port_binding_set_virtual_parent(pb, parent->logical_port);
+
+        next_backoff = VPORT_BACKOFF_INTERVAL;
+        smap_remove(&options, "vport-last-claim");
+        smap_add_format(&options, "vport-last-claim", "%lld", timewall_now);
+
+        VLOG_INFO("Claiming virtual lport %s for this chassis "
+                  "with the virtual parent %s",
+                  pb->logical_port, parent->logical_port);
+    } else {
+        next_backoff = MIN(backoff + VPORT_BACKOFF_INTERVAL,
+                           VPORT_MAX_POSTPONE);
+        VLOG_INFO("Postponing virtual lport %s claim by %lld ms",
+                  pb->logical_port, next_backoff);
+    }
+
+    smap_remove(&options, "vport-backoff");
+    smap_add_format(&options, "vport-backoff", "%lld", next_backoff); > +
+    sbrec_port_binding_set_options(pb, &options);
+    smap_destroy(&options);
  }
/* Called by pinctrl_run(). Runs with in the main ovn-controller
diff --git a/northd/northd.c b/northd/northd.c
index 1d3e132d4..eb940c277 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -3332,6 +3332,21 @@ ovn_port_update_sbrec(struct ovsdb_idl_txn *ovnsb_txn,
                  }
              }
+ /* Preserve virtual port options. */
+            if (!strcmp(op->nbsp->type, "virtual")) {
+                const char *last_claim = smap_get(&op->sb->options,
+                                                  "vport-last-claim");
+                if (last_claim) {
+                    smap_add(&options, "vport-last-claim", last_claim);
+                }
+
+                const char *backoff = smap_get(&op->sb->options,
+                                               "vport-backoff");
+                if (backoff) {
+                    smap_add(&options, "vport-backoff", backoff);
+                }
+            }
+
              if (lsp_is_remote(op->nbsp) ||
                  op->sb->requested_additional_chassis) {
                  /* ovn-northd is supposed to set port_binding for remote ports
diff --git a/tests/ovn.at b/tests/ovn.at
index afde2576f..3c9420d11 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -22766,7 +22766,7 @@ ovs-vsctl -- add-port br-int hv1-vif3 -- \
      options:rxq_pcap=hv1/vif3-rx.pcap \
      ofport-request=3
-ovs-appctl -t ovn-controller vlog/set dbg
+ovs-appctl -t ovn-controller vlog/set pinctrl:dbg
sim_add hv2
  as hv2
@@ -23002,6 +23002,8 @@ check_virtual_offlows_not_present hv1
  # hv2 should not have the flow for ACL.
  check_virtual_offlows_not_present hv2
+# Remove the backoff period, to not prolongue the test.

s/prolongue/prolong/

I wondered if this was an American vs. British difference in spelling, but both Webster's and the OED say it's "prolong".

+ovn-sbctl remove Port_Binding sw0-vir options vport-backoff
  # From sw0-p0 resend GARP for 10.0.0.10. hv1 should reclaim sw0-vir
  # and sw0-p1 should be its virtual_parent.
  send_garp hv1 hv1-vif1 1 $(hex_to_mac $eth_src) $(hex_to_mac $eth_dst) $spa 
$tpa
@@ -23046,6 +23048,8 @@ check_virtual_offlows_not_present hv1
  # hv2 should not have the above flows.
  check_virtual_offlows_not_present hv2
+# Wait for the backoff period to pass.
+sleep 1.2
  # From sw0-p0 send GARP for 10.0.0.10. hv1 should claim sw0-vir
  # and sw0-p1 should be its virtual_parent.
  eth_src="50:54:00:00:00:03"
@@ -23066,6 +23070,8 @@ check_virtual_offlows_present hv1
  # hv2 should not have the above flows.
  check_virtual_offlows_not_present hv2
+# Remove the backoff period, to not prolongue the test.
+ovn-sbctl remove Port_Binding sw0-vir options vport-backoff
  # From sw0-p3 send GARP for 10.0.0.10. hv1 should claim sw0-vir
  # and sw0-p3 should be its virtual_parent.
  eth_src="50:54:00:00:00:05"
@@ -23094,6 +23100,8 @@ check_virtual_offlows_present hv1
  # hv2 should not have the above flows.
  check_virtual_offlows_not_present hv2
+# Remove the backoff period, to not prolongue the test.
+ovn-sbctl remove Port_Binding sw0-vir options vport-backoff
  # send the garp from sw0-p2 (in hv2). hv2 should claim sw0-vir
  # and sw0-p2 shpuld be its virtual_parent.
  eth_src="50:54:00:00:00:04"
@@ -23121,6 +23129,8 @@ check_virtual_offlows_present hv2
  # hv1 should not have the above flows.
  check_virtual_offlows_not_present hv1
+# Remove the backoff period, to not prolongue the test.
+ovn-sbctl remove Port_Binding sw0-vir options vport-backoff
  # Now send arp reply from sw0-p1. hv1 should claim sw0-vir
  # and sw0-p1 shpuld be its virtual_parent.
  eth_src=505400000003
@@ -42500,3 +42510,99 @@ OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv2/vif1-tx.pcap], 
[expected])
  OVN_CLEANUP([hv1], [hv2])
  AT_CLEANUP
  ])
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([virtual port claim postpone])
+AT_KEYWORDS([virtual ports])
+ovn_start
+
+net_add n1
+
+sim_add hv1
+as hv1
+check ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.1
+check ovs-vsctl -- add-port br-int hv1-vif1 -- \
+    set interface hv1-vif1 external-ids:iface-id=sw0-p1 \
+    options:tx_pcap=hv1/vif1-tx.pcap \
+    options:rxq_pcap=hv1/vif1-rx.pcap
+
+sim_add hv2
+as hv2
+check ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.2
+check ovs-vsctl -- add-port br-int hv2-vif1 -- \
+    set interface hv2-vif1 external-ids:iface-id=sw0-p2 \
+    options:tx_pcap=hv2/vif1-tx.pcap \
+    options:rxq_pcap=hv2/vif1-rx.pcap
+
+sim_add hv3
+as hv3
+check ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.3
+check ovs-vsctl -- add-port br-int hv3-vif1 -- \
+    set interface hv3-vif1 external-ids:iface-id=sw0-p3 \
+    options:tx_pcap=hv3/vif1-tx.pcap \
+    options:rxq_pcap=hv3/vif1-rx.pcap
+
+check ovn-nbctl ls-add sw0
+
+check ovn-nbctl lsp-add sw0 sw0-p1
+check ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:01 10.0.0.1"
+check ovn-nbctl lsp-set-port-security sw0-p1 "50:54:00:00:00:01 10.0.0.1 
10.0.0.10"
+
+check ovn-nbctl lsp-add sw0 sw0-p2
+check ovn-nbctl lsp-set-addresses sw0-p2 "50:54:00:00:00:02 10.0.0.2"
+check ovn-nbctl lsp-set-port-security sw0-p2 "50:54:00:00:00:02 10.0.0.2 
10.0.0.10"
+
+check ovn-nbctl lsp-add sw0 sw0-p3
+check ovn-nbctl lsp-set-addresses sw0-p3 "50:54:00:00:00:03 10.0.0.3"
+check ovn-nbctl lsp-set-port-security sw0-p3 "50:54:00:00:00:03 10.0.0.3 
10.0.0.10"
+
+check ovn-nbctl lsp-add sw0 sw0-vir
+check ovn-nbctl lsp-set-addresses sw0-vir "50:54:00:00:00:10 10.0.0.10"
+check ovn-nbctl lsp-set-type sw0-vir virtual
+check ovn-nbctl set logical_switch_port sw0-vir options:virtual-ip=10.0.0.10
+check ovn-nbctl set logical_switch_port sw0-vir 
options:virtual-parents="sw0-p1,sw0-p2,sw0-p3"
+
+wait_for_ports_up
+check ovn-nbctl --wait=hv sync
+
+# Claim virtual port on hv1.
+send_garp hv1 hv1-vif1 1 "50:54:00:00:00:01" "ff:ff:ff:ff:ff:ff" 10.0.0.10 
10.0.0.10
+wait_row_count Port_Binding 1 logical_port=sw0-vir chassis=$(fetch_column 
Chassis _uuid name=hv1)
+wait_row_count Port_Binding 1 logical_port=sw0-vir virtual_parent=sw0-p1
+wait_for_ports_up sw0-vir
+
+# Check that the backoff interval was set.
+AT_CHECK([fetch_column Port_Binding options logical_port=sw0-vir | grep -q 
"vport-last-claim"])
+check_row_count Port_Binding 1 logical_port=sw0-vir options:vport-backoff=1000
+
+# Pass the backoff timer
+sleep 1.2
+
+# Try to claim on all chassis in quick succession.
+for i in $(seq 1 10); do
+    send_garp hv3 hv3-vif1 1 "50:54:00:00:00:03" "ff:ff:ff:ff:ff:ff" 10.0.0.10 
10.0.0.10
+    send_garp hv2 hv2-vif1 1 "50:54:00:00:00:02" "ff:ff:ff:ff:ff:ff" 10.0.0.10 
10.0.0.10
+    send_garp hv1 hv1-vif1 1 "50:54:00:00:00:01" "ff:ff:ff:ff:ff:ff" 10.0.0.10 
10.0.0.10
+    sleep 0.2
+done
+
+wait_row_count Port_Binding 1 logical_port=sw0-vir chassis=$(fetch_column 
Chassis _uuid name=hv3)
+wait_row_count Port_Binding 1 logical_port=sw0-vir virtual_parent=sw0-p3
+wait_for_ports_up sw0-vir
+
+# Check that the backoff interval reached the maximum.
+AT_CHECK([fetch_column Port_Binding options logical_port=sw0-vir | grep -q 
"vport-last-claim"])
+check_row_count Port_Binding 1 logical_port=sw0-vir options:vport-backoff=10000
+
+AT_CHECK([grep -q "Postponing virtual lport sw0-vir" hv1/ovn-controller.log ])
+AT_CHECK([grep -q "Postponing virtual lport sw0-vir" hv2/ovn-controller.log ])
+
+# hv3 has the claim already, it doesn't bump the backoff interval.
+AT_CHECK([grep -q "Postponing virtual lport sw0-vir" hv3/ovn-controller.log], 
[1])
+
+OVN_CLEANUP([hv1])
+AT_CLEANUP
+])

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to