On Wed, Dec 20, 2023 at 4:28 PM Mohammad Heib <mh...@redhat.com> wrote:
> Until now, there has been no reliable for the CMS to detect when > changes made to the INB configuration have been passed through > to the ISB, This commit adds this feature to the system, > by adding sequence numbers to the INB and ISB and adding code > in ovn-ic-nbctl, ovn-ic to keep those sequence numbers up-to-date. > > The biggest user-visible change from this commit is a new option > '--wait' and new command 'sync' to ovn-ic-nbctl. > With --wait=sb, ovn-ic-nbctl now waits for ovn-ics to update the ISB > database. > > Signed-off-by: Mohammad Heib <mh...@redhat.com> > --- > Hi Mohammad, thank you for the series, I have a few comments down below. One small nit, the commit messages are a bit inconsistent, some patches have label IC, some ovn-ci. I guess all of this would fall into category ovn-ic with exception to the last patch. Also please include a cover letter for a series that has some common theme. > utilities/ovn-ic-nbctl.8.xml | 49 +++++++++++++++++++++++ > utilities/ovn-ic-nbctl.c | 77 +++++++++++++++++++++++++++++++++++- > 2 files changed, 124 insertions(+), 2 deletions(-) > > diff --git a/utilities/ovn-ic-nbctl.8.xml b/utilities/ovn-ic-nbctl.8.xml > index 4a70994b8..5a1324d2d 100644 > --- a/utilities/ovn-ic-nbctl.8.xml > +++ b/utilities/ovn-ic-nbctl.8.xml > @@ -123,9 +123,58 @@ > </dd> > </dl> > > + <h1>Synchronization Commands</h1> > + > + <dl> > + <dt>sync</dt> > + <dd> > + Ordinarily, <code>--wait=sb</code> only waits for changes by the > + current <code>ovn-ic-nbctl</code> invocation to take effect. > + This means that, if none of the commands supplied to > + <code>ovn-ic-nbctl</code> change the database, then the command > does > + not wait at all. With the <code>sync</code> command, however, > + <code>ovn-ic-nbctl</code> waits even for earlier changes to the > + database to propagate down to the southbound database, according > to the > + argument of <code>--wait</code>. > + </dd> > + </dl> > + > <h1>Options</h1> > > <dl> > + <dt><code>--no-wait</code> | <code>--wait=none</code></dt> > + <dt><code>--wait=sb</code></dt> > + > + <dd> > + <p> > + These options control whether and how <code>ovn-ic-nbctl</code> > waits > + for the OVN system to become up-to-date with changes made in an > + <code>ovn-ic-nbctl</code> invocation. > + </p> > + > + <p> > + By default, or if <code>--no-wait</code> or > <code>--wait=none</code>, > + <code>ovn-ic-nbctl</code> exits immediately after confirming > that > + changes have been committed to the Interconnect northbound > database, > + without waiting. > + </p> > + > + <p> > + With <code>--wait=sb</code>, before <code>ovn-ic-nbctl</code> > exits, > + it waits for <code>ovn-ics</code> to bring the Interconnect > + southbound database up-to-date with the Interconnect northbound > + database updates. > + </p> > + > + <p> > + Ordinarily, <code>--wait=sb</code> only waits for changes by the > + current <code>ovn-ic-nbctl</code> invocation to take effect. > + This means that, if none of the commands supplied to > + <code>ovn-ic-nbctl</code> change the database, then the command > + does not wait at all. > + Use the <code>sync</code> command to override this behavior. > + </p> > + </dd> > <dt><code>--db</code> <var>database</var></dt> > <dd> > The OVSDB database remote to contact. If the > <env>OVN_IC_NB_DB</env> > diff --git a/utilities/ovn-ic-nbctl.c b/utilities/ovn-ic-nbctl.c > index 721dc4586..7eca5bdc4 100644 > --- a/utilities/ovn-ic-nbctl.c > +++ b/utilities/ovn-ic-nbctl.c > @@ -58,6 +58,13 @@ static bool oneline; > /* --dry-run: Do not commit any changes. */ > static bool dry_run; > > +/* --wait=TYPE: Wait for configuration change to take effect? */ > +static enum nbctl_wait_type wait_type = NBCTL_WAIT_NONE; > + > +/* Should we wait (if specified by 'wait_type') even if the commands don't > + * change the database at all */ > +static bool force_wait = false; > + > /* --timeout: Time to wait for a connection to 'db'. */ > static unsigned int timeout; > > @@ -161,6 +168,8 @@ parse_options(int argc, char *argv[], struct shash > *local_options) > OPT_DB = UCHAR_MAX + 1, > OPT_ONELINE, > OPT_NO_SYSLOG, > + OPT_NO_WAIT, > + OPT_WAIT, > OPT_DRY_RUN, > OPT_LOCAL, > OPT_COMMANDS, > @@ -173,6 +182,8 @@ parse_options(int argc, char *argv[], struct shash > *local_options) > static const struct option global_long_options[] = { > {"db", required_argument, NULL, OPT_DB}, > {"no-syslog", no_argument, NULL, OPT_NO_SYSLOG}, > + {"no-wait", no_argument, NULL, OPT_NO_WAIT}, > + {"wait", required_argument, NULL, OPT_WAIT}, > {"dry-run", no_argument, NULL, OPT_DRY_RUN}, > {"oneline", no_argument, NULL, OPT_ONELINE}, > {"timeout", required_argument, NULL, 't'}, > @@ -234,7 +245,19 @@ parse_options(int argc, char *argv[], struct shash > *local_options) > case OPT_DRY_RUN: > dry_run = true; > break; > - > + case OPT_NO_WAIT: > + wait_type = NBCTL_WAIT_NONE; > + break; > + case OPT_WAIT: > + if (!strcmp(optarg, "none")) { > + wait_type = NBCTL_WAIT_NONE; > + } else if (!strcmp(optarg, "sb")) { > + wait_type = NBCTL_WAIT_SB; > + } else { > + ctl_fatal("argument to --wait must be " > + "\"none\", \"sb\" "); > + } > + break; > case OPT_LOCAL: > if (shash_find(local_options, options[idx].name)) { > ctl_fatal("'%s' option specified multiple times", > @@ -329,9 +352,14 @@ set the SSL configuration\n\ > %s\ > %s\ > \n\ > +Synchronization command (use with --wait=sb):\n\ > + sync wait even for earlier changes to take effect\n\ > +\n\ > Options:\n\ > --db=DATABASE connect to DATABASE\n\ > (default: %s)\n\ > + --no-wait, --wait=none do not wait for OVN reconfiguration > (default)\n\ > + --wait=sb wait for southbound database update\n\ > --no-leader-only accept any cluster member, not just the > leader\n\ > -t, --timeout=SECS wait at most SECS seconds\n\ > --dry-run do not commit changes to database\n\ > @@ -380,6 +408,12 @@ ic_nbctl_init(struct ctl_context *ctx OVS_UNUSED) > { > } > > +static void > +ic_nbctl_pre_sync(struct ctl_context *base OVS_UNUSED) > +{ > + force_wait = true; > +} > + > static void > ic_nbctl_ts_add(struct ctl_context *ctx) > { > @@ -748,6 +782,7 @@ do_ic_nbctl(const char *args, struct ctl_command > *commands, size_t n_commands, > struct ic_nbctl_context ic_nbctl_ctx; > struct ctl_command *c; > struct shash_node *node; > + int64_t next_cfg = 0; > > txn = the_idl_txn = ovsdb_idl_txn_create(idl); > if (dry_run) { > @@ -762,6 +797,17 @@ do_ic_nbctl(const char *args, struct ctl_command > *commands, size_t n_commands, > icnbrec_ic_nb_global_insert(txn); > } > > + if (wait_type != NBCTL_WAIT_NONE) { > + > + /* Deal with potential overflows. */ > + if (inb->nb_ic_cfg == LLONG_MAX) { > + icnbrec_ic_nb_global_set_nb_ic_cfg(inb, 0); > + } > + ovsdb_idl_txn_increment(txn, &inb->header_, > + &icnbrec_ic_nb_global_col_nb_ic_cfg, > + force_wait); > + } > + > symtab = ovsdb_symbol_table_create(); > for (c = commands; c < &commands[n_commands]; c++) { > ds_init(&c->output); > @@ -807,6 +853,9 @@ do_ic_nbctl(const char *args, struct ctl_command > *commands, size_t n_commands, > } > > status = ovsdb_idl_txn_commit_block(txn); > + if (wait_type != NBCTL_WAIT_NONE && status == TXN_SUCCESS) { > + next_cfg = ovsdb_idl_txn_get_increment_new_value(txn); > + } > This block could be moved to a separate function see down below. > if (status == TXN_UNCHANGED || status == TXN_SUCCESS) { > for (c = commands; c < &commands[n_commands]; c++) { > if (c->syntax->postprocess) { > @@ -884,6 +933,30 @@ do_ic_nbctl(const char *args, struct ctl_command > *commands, size_t n_commands, > shash_destroy_free_data(&c->options); > } > free(commands); > + > + if (wait_type == NBCTL_WAIT_NONE) { > + if (force_wait) { > + VLOG_INFO("\"sync\" command has no effect without --wait"); > + } > + goto done; > + } > + > + if (wait_type != NBCTL_WAIT_NONE && status != TXN_UNCHANGED) { > You are checking NBCTL_WAIT_NONE above, so this first condition will always be true. > + ovsdb_idl_enable_reconnect(idl); > + for (;;) { > + ovsdb_idl_run(idl); > + ICNBREC_IC_NB_GLOBAL_FOR_EACH (inb, idl) { > + int64_t cur_cfg = inb->sb_ic_cfg; > + if (cur_cfg >= next_cfg) { > This suffers from the overflow bug described by [0], we should do the same/similar thing here to ensure that we don't return sync before we are actually synced. > + goto done; > + } > + } > + ovsdb_idl_wait(idl); > + poll_block(); > + } > + done: ; > This done label is strange, there goto done outside of this if block, we could avoid the whole goto if we moved the logic above to a separate function. In that function you could just return. Also having a separate function for this will make it easier for future extensions e.g. printing wait time. > + } > + > ovsdb_idl_txn_destroy(txn); > ovsdb_idl_destroy(idl); > > @@ -924,7 +997,7 @@ ic_nbctl_exit(int status) > > static const struct ctl_command_syntax ic_nbctl_commands[] = { > { "init", 0, 0, "", NULL, ic_nbctl_init, NULL, "", RW }, > - > + { "sync", 0, 0, "", ic_nbctl_pre_sync, NULL, NULL, "", RO }, > /* transit switch commands. */ > { "ts-add", 1, 1, "SWITCH", NULL, ic_nbctl_ts_add, NULL, > "--may-exist", RW }, > { "ts-del", 1, 1, "SWITCH", NULL, ic_nbctl_ts_del, NULL, > "--if-exists", RW }, > -- > 2.34.3 > > _______________________________________________ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > Thanks, Ales [0] https://github.com/ovn-org/ovn/commit/c6515f5a -- Ales Musil Senior Software Engineer - OVN Core Red Hat EMEA <https://www.redhat.com> amu...@redhat.com <https://red.ht/sig> _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev