According to the ovn-architecture(7) sb_cfg counter should be
propagated at the moment northd completed transaction of related changes
to the southbound db. However, a processing engine call happened right
between two events, a sb transaction commitment, and initiating
the sb_cfg write. Normally, that processing engine call has nothing
to do, because it has only update from sb db that fires back result
of recent transaction from northd. But if, by some reason, engine
change handler falls into full recompute there will be big delay
before sb_cfg propagated to nb db, and in fact it really happened
in old release, for example 22.09.

The fix defers engine run for one iteration (of main loop)
if we have sb_cfg ready to propagate.

Signed-off-by: Aleksandr Smirnov <[email protected]>
---
v2: Address Mark's comments.
v3: Address Vladislav's comments.
---
 northd/ovn-northd.c | 22 ++++++++++++++++++++--
 tests/ovn-northd.at | 37 +++++++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 72eedbfdb..68a31d0b0 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -798,6 +798,15 @@ run_memory_trimmer(struct ovsdb_idl *ovnnb_idl, bool 
activity)
     memory_trimmer_wait(mt);
 }
 
+static bool
+sb_cfg_is_in_sync(struct ovsdb_idl *ovnnb_idl,
+                  struct ovsdb_idl_loop *sb_loop)
+{
+    const struct nbrec_nb_global *nb = nbrec_nb_global_first(ovnnb_idl);
+
+    return !(nb && sb_loop->cur_cfg && nb->sb_cfg != sb_loop->cur_cfg);
+}
+
 int
 main(int argc, char *argv[])
 {
@@ -1064,8 +1073,17 @@ main(int argc, char *argv[])
                 if (ovnnb_txn && ovnsb_txn &&
                     inc_proc_northd_can_run(&eng_ctx)) {
                     int64_t loop_start_time = time_wall_msec();
-                    activity = inc_proc_northd_run(ovnnb_txn, ovnsb_txn,
-                                                   &eng_ctx);
+
+                    if (sb_cfg_is_in_sync(ovnnb_idl_loop.idl,
+                                          &ovnsb_idl_loop)) {
+                        activity = inc_proc_northd_run(ovnnb_txn,
+                                                       ovnsb_txn,
+                                                       &eng_ctx);
+                    } else {
+                        poll_immediate_wake();
+                        clear_idl_track = false;
+                    }
+
                     check_and_add_supported_dhcp_opts_to_sb_db(
                                  ovnsb_txn, ovnsb_idl_loop.idl);
                     check_and_add_supported_dhcpv6_opts_to_sb_db(
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index cfaba19bf..5db6d4984 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -16748,3 +16748,40 @@ check grep -q "Bad configuration: The peer of the 
switch port 'ls-lrp1' (LRP pee
 
 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD_NO_HV([
+AT_SETUP([sb_cfg propagation])
+ovn_start
+
+#
+# Test engine call does not happen between sb db transaction
+# commitment and sb_cfg write to the nb db (if have to)
+#
+> northd/ovn-northd.log
+check as northd ovn-appctl -t ovn-northd vlog/set jsonrpc:dbg
+check as northd ovn-appctl -t ovn-northd vlog/set inc_proc_eng:dbg
+# Any change that invoke engine processing, for example add logical switch.
+# nb_cfg bump must be present in order to get feedback as sb_cfg.
+check ovn-nbctl --wait=sb ls-add sw1
+#
+# Get following messages from log file:
+# 'unix... transact ... Southbound' --> Indicates the pack of updates has been
+#                                       committed to the sb db.
+# 'unix... transact ... sb_cfg'  --> Indicates the sb_cfg has been committed to
+#                                    the sb db.
+# 'Initializing new run' --> Indicates the engine has been called.
+#
+# Then, take first letter of messages, here 'u' or 'I' and form them into one
+# string.
+#
+# Finally, a 'good' string should have two 'u' conjuncted with several 'I'
+# both as prefix and suffix, while 'bad' string will have 'I' between two 'u'.
+#
+call_seq=`grep -E \
+ "(transact.*sb_cfg)|(transact.*Southbound)|(Initializing new run)" \
+ northd/ovn-northd.log | cut -d "|" -f 5- | cut -b 1 | tr -d '\n'`
+
+AT_CHECK([echo $call_seq | grep -qE "^I*uuI*$"], [0])
+
+AT_CLEANUP
+])
-- 
2.47.0

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to