Hi Ales,

I have some high-level comments/questions about this patch.

I have been privy to the conversations that led to this change. My understanding is that by having ovn-northd wake up immediately, it can be more CPU-intensive than waiting a bit for changes to accumulate and handling all of those at once instead. However, nothing in either the commit message or ovn-nb.xml explains what the purpose of this new configuration option is. I think you should add a sentence or two to explain why someone would want to enable this option.

Next, the algorithm used here strikes me as odd. We use the previous run time of ovn-northd to determine how long to wait before running again. This delay is capped by the configured backoff time. Let's say that we've configured the backoff interval to be 200 ms. If ovn-northd has a super quick run and only takes 10 ms, then we will only delay the next run by 10 ms. IMO, this seems like it would not mitigate the original issue by much, since we are only allowing a maximum of 20 ms (10 ms for the run of ovn-northd + 10 ms delay) of NB changes to accumulate. Conversely, if northd has a huge recompute and it takes 5000 ms to complete, then we would delay the next run by 200ms. In this case, delaying at all seems like it's not necessary since we potentially have 5000 ms worth of NB DB updates that have not been addressed. I would have expected the opposite approach to be taken. If someone configures 200ms as their backoff interval, I would expect us to always allow a *minimum* of 200ms of NB changes to accumulate before running again. So for instance, if northd runs quickly and is done in 10 ms, then we would wait 200 - 10 = 190 ms before processing changes again. If northd takes a long time to recompute and takes 5000 ms, then we would not wait at all before processing changes again. Was the algorithm chosen based on experimentation? Is it a well-known method I'm just not familiar with?

Next, I notice that you've added new poll_timer_wait() calls but haven't changed the ovsdb_idl_loop_run() or ovsdb_idl_loop_commit_and_wait() calls. Is there any danger of ovn-northd getting in a busy loop of sleeping and waking because of this? I don't think it should, since presumably ovsdb_idl_loop_run() should clear the conditions waited on by ovsdb_idl_loop_commit_and_wait(), but I want to double-check.

Next, does this have any negative impact on our ability to perform incremental processing in ovn-northd? My concern is that since we are still running the ovsdb IDL loop that if multiple NB changes occur during our delay, then we might have to fall back to a full recompute instead of being able to incrementally process the changes. Are my concerns valid?

Next, has scale testing shown that this change has made a positive impact? If so, is there any recommendation for how to determine what to configure the value to?

Finally, is this change expected to be a long-term necessity? This option seems to be useful for cases where northd recomputes are required. Performing recomputes less frequently seems like it would lower the CPU usage of ovn-northd while still processing the same amount of changes. However, once northd can handle most changes incrementally, is there still a benefit to delaying running? If each run of northd handles all DB changes incrementally, then is there any point in putting delays between those incremental runs?

On 8/9/23 01:29, Ales Musil wrote:
Add config option called "northd-backoff-interval-ms" that allows
to delay northd engine runs capped by the config option.
When the config option is set to 0 or unspecified, the engine
will run without any restrictions. If the value >0 we will delay
northd engine run by the previous run time capped by the
config option.

Signed-off-by: Ales Musil <amu...@redhat.com>
---
  NEWS                     |  2 ++
  northd/inc-proc-northd.c | 23 ++++++++++++++++++++++-
  northd/inc-proc-northd.h |  3 ++-
  northd/ovn-northd.c      |  9 +++++++--
  ovn-nb.xml               |  7 +++++++
  5 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index 8275877f9..6109f13a2 100644
--- a/NEWS
+++ b/NEWS
@@ -10,6 +10,8 @@ Post v23.06.0
    - To allow optimizing ovn-controller's monitor conditions for the regular
      VIF case, ovn-controller now unconditionally monitors all sub-ports
      (ports with parent_port set).
+  - Add "northd-backoff-interval-ms" config option to delay northd engine
+    runs capped at the set value.
OVN v23.06.0 - 01 Jun 2023
  --------------------------
diff --git a/northd/inc-proc-northd.c b/northd/inc-proc-northd.c
index d328deb22..87db50ad1 100644
--- a/northd/inc-proc-northd.c
+++ b/northd/inc-proc-northd.c
@@ -42,6 +42,8 @@ VLOG_DEFINE_THIS_MODULE(inc_proc_northd);
static unixctl_cb_func chassis_features_list; +static int64_t next_northd_run_ms = 0;
+
  #define NB_NODES \
      NB_NODE(nb_global, "nb_global") \
      NB_NODE(logical_switch, "logical_switch") \
@@ -295,8 +297,10 @@ void inc_proc_northd_init(struct ovsdb_idl_loop *nb,
  /* Returns true if the incremental processing ended up updating nodes. */
  bool inc_proc_northd_run(struct ovsdb_idl_txn *ovnnb_txn,
                           struct ovsdb_idl_txn *ovnsb_txn,
-                         bool recompute) {
+                         bool recompute, uint32_t backoff_ms) {
      ovs_assert(ovnnb_txn && ovnsb_txn);
+
+    int64_t start = time_msec();
      engine_init_run();
/* Force a full recompute if instructed to, for example, after a NB/SB
@@ -330,6 +334,12 @@ bool inc_proc_northd_run(struct ovsdb_idl_txn *ovnnb_txn,
      } else {
          engine_set_force_recompute(false);
      }
+
+    int64_t now = time_msec();
+    /* Postpone the next run by length of current run with maximum capped
+     * by "northd-backoff-interval-ms" interval. */
+    next_northd_run_ms = now + MIN(now - start, backoff_ms);
+
      return engine_has_updated();
  }
@@ -339,6 +349,17 @@ void inc_proc_northd_cleanup(void)
      engine_set_context(NULL);
  }
+bool
+inc_proc_northd_can_run(bool recompute)
+{
+    if (recompute || time_msec() >= next_northd_run_ms) {
+        return true;
+    }
+
+    poll_timer_wait_until(next_northd_run_ms);
+    return false;
+}
+
  static void
  chassis_features_list(struct unixctl_conn *conn, int argc OVS_UNUSED,
                        const char *argv[] OVS_UNUSED, void *features_)
diff --git a/northd/inc-proc-northd.h b/northd/inc-proc-northd.h
index 9b81c7ee0..af418d7d7 100644
--- a/northd/inc-proc-northd.h
+++ b/northd/inc-proc-northd.h
@@ -10,7 +10,8 @@ void inc_proc_northd_init(struct ovsdb_idl_loop *nb,
                            struct ovsdb_idl_loop *sb);
  bool inc_proc_northd_run(struct ovsdb_idl_txn *ovnnb_txn,
                           struct ovsdb_idl_txn *ovnsb_txn,
-                         bool recompute);
+                         bool recompute, uint32_t backoff_ms);
  void inc_proc_northd_cleanup(void);
+bool inc_proc_northd_can_run(bool recompute);
#endif /* INC_PROC_NORTHD */
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 4fa1b039e..3202b50a1 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -868,6 +868,7 @@ main(int argc, char *argv[])
      /* Main loop. */
      exiting = false;
+ uint32_t northd_backoff_ms = 0;
      bool recompute = false;
      while (!exiting) {
          update_ssl_config();
@@ -932,10 +933,12 @@ main(int argc, char *argv[])
if (ovsdb_idl_has_lock(ovnsb_idl_loop.idl)) {
                  bool activity = false;
-                if (ovnnb_txn && ovnsb_txn) {
+                if (ovnnb_txn && ovnsb_txn &&
+                    inc_proc_northd_can_run(recompute)) {
                      int64_t loop_start_time = time_wall_msec();
                      activity = inc_proc_northd_run(ovnnb_txn, ovnsb_txn,
-                                                        recompute);
+                                                   recompute,
+                                                   northd_backoff_ms);
                      recompute = false;
                      check_and_add_supported_dhcp_opts_to_sb_db(
                                   ovnsb_txn, ovnsb_idl_loop.idl);
@@ -1019,6 +1022,8 @@ main(int argc, char *argv[])
          if (nb) {
              interval = smap_get_int(&nb->options, "northd_probe_interval",
                                      interval);
+            northd_backoff_ms = smap_get_uint(&nb->options,
+                                              "northd-backoff-interval-ms", 0);
          }
          set_idl_probe_interval(ovnnb_idl_loop.idl, ovnnb_db, interval);
          set_idl_probe_interval(ovnsb_idl_loop.idl, ovnsb_db, interval);
diff --git a/ovn-nb.xml b/ovn-nb.xml
index 4fbf4f7e5..115dfd536 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -349,6 +349,13 @@
          of HWOL compatibility with GDP.
        </column>
+ <column name="options" key="northd-backoff-interval-ms">
+        Maximum interval that the northd incremental engine is delayed by
+        in milliseconds. Setting the value to nonzero delays the next northd
+        engine run by the previous run time, capped by the specified value.
+        If the value is zero the engine won't be delayed at all.
+      </column>
+
        <group title="Options for configuring interconnection route 
advertisement">
          <p>
            These options control how routes are advertised between OVN

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to