On 5/3/23 16:12, Vladislav Odintsov wrote: > Hi Dumitru and Mark, > > thanks for the review! > >> On 3 May 2023, at 16:56, Dumitru Ceara <dce...@redhat.com> wrote: >> >> On 5/3/23 15:47, Mark Michelson wrote: >>> On 5/3/23 09:03, Dumitru Ceara wrote: >>>> Hi Vladislav, >>>> >>>> On 5/3/23 02:55, Vladislav Odintsov wrote: >>>>> For large OVN_Southbound (or other) databases the default interval of >>>>> 5000 ms >>>>> could be not sufficient to run. This patch adds configuration of OVSDB >>>>> inactivity probes for ovn-*ctl utilities. >>>>> >>>>> Initially, on the utility start the hardcoded value of 120000 ms is set. >>>>> For daemon-mode it is possible to configure intervals in OVN Northbound >>>>> database: >>>>> OVN_Northbound.NB_Global.options.dbctl_inactivity_probe for ovn-nbctl >>>>> and >>>>> ovn-sbctl utilities. >>>>> >>>>> If no DB configuration option was provided, the initial (120000 ms) >>>>> interval >>>>> is left. >>>> >>>> Nit: I would add here something like "A non-zero inactivity probe >>>> interval is required in order to allow DB clients to detect connection >>>> failures due to network issues." >>>> >>>>> >>>>> Signed-off-by: Vladislav Odintsov <odiv...@gmail.com> >>>>> --- >>>>> lib/ovn-util.h | 2 ++ >>>>> ovn-nb.xml | 20 ++++++++++++++++++++ >>>>> utilities/ovn-dbctl.c | 18 ++++++++++++++++++ >>>>> utilities/ovn-dbctl.h | 1 + >>>>> utilities/ovn-ic-nbctl.c | 4 ++++ >>>>> utilities/ovn-ic-sbctl.c | 4 ++++ >>>>> utilities/ovn-nbctl.c | 15 +++++++++++++++ >>>>> utilities/ovn-sbctl.c | 15 +++++++++++++++ >>>> >>>> I'm not sure whether we should we mention this user visible change in >>>> the NEWS file. >>> >>> I think it's worth mentioning in the NEWS file. >>> >>>> >>>>> 8 files changed, 79 insertions(+) >>>>> >>>>> diff --git a/lib/ovn-util.h b/lib/ovn-util.h >>>>> index 7cf861dbc..47e53ca74 100644 >>>>> --- a/lib/ovn-util.h >>>>> +++ b/lib/ovn-util.h >>>>> @@ -154,6 +154,8 @@ void set_idl_probe_interval(struct ovsdb_idl >>>>> *idl, const char *remote, >>>>> #define OVN_MAX_DP_VXLAN_KEY ((1u << 12) - 1) >>>>> #define OVN_MAX_DP_VXLAN_KEY_LOCAL (OVN_MAX_DP_KEY - >>>>> OVN_MAX_DP_GLOBAL_NUM) >>>>> +#define DEFAULT_UTILS_PROBE_INTERVAL_MSEC 120000 >>>> >>>> I think this should go to ovn-dbctl.h instead but whoever applies the >>>> patch can move this. So I don't think there's a need for v3, therefore: >>> >>> I also agree with this suggestion. > > I’m also fine with this change and have no objection to fix this on the apply > time if needed. > >>> >>>> >>>> Acked-by: Dumitru Ceara <dce...@redhat.com> >>>> >>>> Ilya, Mark, Numan, Han, what do you guys think? >>> >>> Before I can ack this patch, I want to bring up two points: >>> >>> 1) This patch updates ovn-nb.xml, but it does not update ovn-sb.xml to >>> document the new SB option. >>> >> >> We already don't document everything that applies to both. >> >>> 2) I'm not a big fan of the name "dbctl_probe_interval", because the >>> term "dbctl" is not well-known to most OVN users. My suggestion would be >>> to call the NB option "nbctl_probe_interval" or >>> "ovn_nbctl_probe_interval" and to call the SB option >>> "sbctl_probe_interval" or "ovn_sbctl_probe_interval". This makes it more >>> clear that the option pertains to the "ovn-nbctl" and "ovn-sbctl" >>> utilities. However, if others disagree with me on this, then we can use >>> "dbctl_probe_interval" . >>> >> >> The problem is northd propagates all NB.NB_Global.options to >> SB.SB_Global.options. > > Yes, this is due to the facts of northd logic for propagation options from > NB_Global to SB_Global table. > >> >>> I also have a couple of small findings below. >>> >>>> >>>> Regards, >>>> Dumitru >>>> >>>>> + >>>>> struct hmap; >>>>> void ovn_destroy_tnlids(struct hmap *tnlids); >>>>> bool ovn_add_tnlid(struct hmap *set, uint32_t tnlid); >>>>> diff --git a/ovn-nb.xml b/ovn-nb.xml >>>>> index fd32070f2..a82f31a92 100644 >>>>> --- a/ovn-nb.xml >>>>> +++ b/ovn-nb.xml >>>>> @@ -215,6 +215,26 @@ >>>>> </p> >>>>> </column> >>>>> + <column name="options" key="dbctl_probe_interval"> >>>>> + <p> >>>>> + The inactivity probe interval of the connection to the OVN >>>>> Northbound >>>>> + and Southbound databases from <code>ovn-nbctl</code> and >>>>> + <code>ovn-sbctl</code> utilities respectively, in >>>>> milliseconds. >>>>> + If the value is zero, it disables the connection keepalive >>>>> feature. >>>>> + </p> >>> >>> The documentation makes it sound as though the NB dbctl_probe_interval >>> affects *both* ovn-nbctl and ovn-sbctl. However, this only affects >>> ovn-nbctl. ovn-sbctl is controlled by the SB database. >>> >> >> Due to the fact that the option is propagated by northd it eventually >> applies to both. But thinking some more about it, you're right, it's >> probably better to add explicit nb and sb options. > > This was my first approach, where I tried to read nbctl_… and sbctl_… options > from appropriate databases, but faced with northd syncing (and deletion > option from SB). > I decided to make a common option thinking that for northd there is similar > logic where NB and SB connections are configured from one option > (northd_probe_interval). > > You you think it’s worth to split options, so I need some time find out how > to prevent northd from syncing custom option from NB to SB and how to make it > not to delete new option from SB. >
It's not really great but we already do that for "lb_hairpin_use_ct_mark". Maybe something similar? >> >>>>> + >>>>> + <p> >>>>> + If the value is nonzero, then it will be forced to a value of >>>>> + at least 1000 ms. >>>>> + </p> >>>>> + >>>>> + <p> >>>>> + If the value less then zero, then the default inactivity >>>>> probe >>> >>> "If the value is less than zero" >>> >>>>> + interval for <code>ovn-nbctl</code> and >>>>> <code>ovn-sbctl</code> would >>>>> + be left intact (120000 ms). >>>>> + </p> >>>>> + </column> >>>>> + >>>>> <column name="options" key="northd_trim_timeout"> >>>>> <p> >>>>> When used, this configuration value specifies the time, in >>>>> diff --git a/utilities/ovn-dbctl.c b/utilities/ovn-dbctl.c >>>>> index 369a6a663..c6aebcbbb 100644 >>>>> --- a/utilities/ovn-dbctl.c >>>>> +++ b/utilities/ovn-dbctl.c >>>>> @@ -205,6 +205,9 @@ ovn_dbctl_main(int argc, char *argv[], >>>>> ovsdb_idl_set_remote(idl, db, daemon_mode); >>>>> ovsdb_idl_set_leader_only(idl, leader_only); >>>>> + /* Set reasonable high probe interval. */ >>>>> + set_idl_probe_interval(idl, db, DEFAULT_UTILS_PROBE_INTERVAL_MSEC); >>>>> + >>>>> if (daemon_mode) { >>>>> server_loop(dbctl_options, idl, argc, argv_); >>>>> } else { >>>>> @@ -1094,6 +1097,17 @@ out: >>>>> free(argv); >>>>> } >>>>> +static void >>>>> +update_inactivity_probe(struct server_cmd_run_ctx *ctx) >>>>> +{ >>>>> + int inactivity_probe = >>>>> ctx->dbctl_options->get_inactivity_probe(ctx->idl); >>>>> + if (inactivity_probe < 0) { >>>>> + inactivity_probe = DEFAULT_UTILS_PROBE_INTERVAL_MSEC; >>>>> + } >>>>> + >>>>> + set_idl_probe_interval(ctx->idl, db, inactivity_probe); >>>>> +} >>>>> + >>>>> static void >>>>> server_loop(const struct ovn_dbctl_options *dbctl_options, >>>>> struct ovsdb_idl *idl, int argc, char *argv[]) >>>>> @@ -1125,6 +1139,10 @@ server_loop(const struct ovn_dbctl_options >>>>> *dbctl_options, >>>>> for (;;) { >>>>> update_ssl_config(); >>>>> + >>>>> + /* Configure inactivity probe from connected DB. */ >>>>> + update_inactivity_probe(&server_cmd_run_ctx); >>>>> + >>>>> memory_run(); >>>>> if (memory_should_report()) { >>>>> struct simap usage = SIMAP_INITIALIZER(&usage); >>>>> diff --git a/utilities/ovn-dbctl.h b/utilities/ovn-dbctl.h >>>>> index a1fbede6b..54c232dd6 100644 >>>>> --- a/utilities/ovn-dbctl.h >>>>> +++ b/utilities/ovn-dbctl.h >>>>> @@ -52,6 +52,7 @@ struct ovn_dbctl_options { >>>>> const struct timer *wait_timeout, >>>>> long long int start_time, bool >>>>> print_wait_time); >>>>> + int (*get_inactivity_probe)(struct ovsdb_idl *); >>>>> struct ctl_context *(*ctx_create)(void); >>>>> void (*ctx_destroy)(struct ctl_context *); >>>>> }; >>>>> diff --git a/utilities/ovn-ic-nbctl.c b/utilities/ovn-ic-nbctl.c >>>>> index f3d8039a8..19d8b054f 100644 >>>>> --- a/utilities/ovn-ic-nbctl.c >>>>> +++ b/utilities/ovn-ic-nbctl.c >>>>> @@ -116,6 +116,10 @@ main(int argc, char *argv[]) >>>>> ovsdb_idl_set_remote(idl, db, false); >>>>> ovsdb_idl_set_db_change_aware(idl, false); >>>>> ovsdb_idl_set_leader_only(idl, leader_only); >>>>> + >>>>> + /* Set reasonable high probe interval. */ >>>>> + set_idl_probe_interval(idl, db, DEFAULT_UTILS_PROBE_INTERVAL_MSEC); >>>>> + >>>>> run_prerequisites(commands, n_commands, idl); >>>>> /* Execute the commands. >>>>> diff --git a/utilities/ovn-ic-sbctl.c b/utilities/ovn-ic-sbctl.c >>>>> index 3060b48b9..215d09d30 100644 >>>>> --- a/utilities/ovn-ic-sbctl.c >>>>> +++ b/utilities/ovn-ic-sbctl.c >>>>> @@ -115,6 +115,10 @@ main(int argc, char *argv[]) >>>>> ovsdb_idl_set_remote(idl, db, false); >>>>> ovsdb_idl_set_db_change_aware(idl, false); >>>>> ovsdb_idl_set_leader_only(idl, leader_only); >>>>> + >>>>> + /* Set reasonable high probe interval. */ >>>>> + set_idl_probe_interval(idl, db, DEFAULT_UTILS_PROBE_INTERVAL_MSEC); >>>>> + >>>>> run_prerequisites(commands, n_commands, idl); >>>>> /* Execute the commands. >>>>> diff --git a/utilities/ovn-nbctl.c b/utilities/ovn-nbctl.c >>>>> index 45572fd30..49c5b294a 100644 >>>>> --- a/utilities/ovn-nbctl.c >>>>> +++ b/utilities/ovn-nbctl.c >>>>> @@ -153,6 +153,20 @@ nbctl_post_execute(struct ovsdb_idl *idl, struct >>>>> ovsdb_idl_txn *txn, >>>>> } >>>>> } >>>>> +static int >>>>> +get_inactivity_probe(struct ovsdb_idl *idl) >>>>> +{ >>>>> + const struct nbrec_nb_global *nb = nbrec_nb_global_first(idl); >>>>> + int interval = -1; >>> >>> Why does this default to -1? Wouldn't it make more sense to use >>> DEFAULT_UTILS_PROBE_INTERVAL_MSEC here? > > Yeap, you’re right. > >>> >>>>> + >>>>> + if (nb) { >>>>> + interval = smap_get_int(&nb->options, "dbctl_probe_interval", >>>>> + interval); >>>>> + } >>>>> + >>>>> + return interval; >>>>> +} >>>>> + >>>>> static char * OVS_WARN_UNUSED_RESULT dhcp_options_get( >>>>> struct ctl_context *ctx, const char *id, bool must_exist, >>>>> const struct nbrec_dhcp_options **); >>>>> @@ -7935,6 +7949,7 @@ main(int argc, char *argv[]) >>>>> .add_base_prerequisites = nbctl_add_base_prerequisites, >>>>> .pre_execute = nbctl_pre_execute, >>>>> .post_execute = nbctl_post_execute, >>>>> + .get_inactivity_probe = get_inactivity_probe, >>>>> .ctx_create = nbctl_ctx_create, >>>>> .ctx_destroy = nbctl_ctx_destroy, >>>>> diff --git a/utilities/ovn-sbctl.c b/utilities/ovn-sbctl.c >>>>> index 542ab9ffa..76465fe7e 100644 >>>>> --- a/utilities/ovn-sbctl.c >>>>> +++ b/utilities/ovn-sbctl.c >>>>> @@ -150,6 +150,20 @@ Other options:\n\ >>>>> * gracefully. */ >>>>> #define ctl_fatal dont_use_ctl_fatal_use_ctl_error_and_return >>>>> +static int >>>>> +get_inactivity_probe(struct ovsdb_idl *idl) >>>>> +{ >>>>> + const struct sbrec_sb_global *sb = sbrec_sb_global_first(idl); >>>>> + int interval = -1; >>> >>> Same comment here about using -1 as the default. >>> >>>>> + >>>>> + if (sb) { >>>>> + interval = smap_get_int(&sb->options, "dbctl_probe_interval", >>>>> + interval); >>>>> + } >>>>> + >>>>> + return interval; >>>>> +} >>>>> + >>>>> /* ovs-sbctl specific context. Inherits the 'struct ctl_context' >>>>> as base. */ >>>>> struct sbctl_context { >>>>> struct ctl_context base; >>>>> @@ -1590,6 +1604,7 @@ main(int argc, char *argv[]) >>>>> .add_base_prerequisites = sbctl_add_base_prerequisites, >>>>> .pre_execute = sbctl_pre_execute, >>>>> .post_execute = NULL, >>>>> + .get_inactivity_probe = get_inactivity_probe, >>>>> .ctx_create = sbctl_ctx_create, >>>>> .ctx_destroy = sbctl_ctx_destroy, >>>> >>> >> >> _______________________________________________ >> dev mailing list >> d...@openvswitch.org >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > Regards, > Vladislav Odintsov > > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev