On 4/14/23 15:37, Vladislav Odintsov wrote: > Hi Dumitru, > >> On 13 Apr 2023, at 11:55, Dumitru Ceara <[email protected]> wrote: >> >> Hi Vladislav, >> >> On 4/3/23 12:08, Ilya Maximets wrote: >>> On 3/31/23 17:17, Dumitru Ceara wrote: >>>> On 3/31/23 16:51, Vladislav Odintsov wrote: >>>>> As I understood from Ilya, in case of one-command run of ovn-sbctl >>>>> (non-daemon mode), it doesn’t make sense to have client -> server >>>>> inactivity probes. If cable is unplugged, will just hit tcp session >>>>> timeout, IIUC. >>>>> Please correct me if I’m wrong. >>>>> >>>> >>>> You're right but the default timeouts are quite large for TCP sessions: >>>> >>>> With keepalives default: >>>> tcp_keepalive_time - INTEGER >>>> How often TCP sends out keepalive messages when keepalive is enabled. >>>> Default: 2hours. >>>> >>>> tcp_keepalive_probes - INTEGER >>>> How many keepalive probes TCP sends out, until it decides that the >>>> connection is broken. Default value: 9. >>>> >>>> tcp_keepalive_intvl - INTEGER >>>> How frequently the probes are send out. Multiplied by >>>> tcp_keepalive_probes it is time to kill not responding connection, >>>> after probes started. Default value: 75sec i.e. connection >>>> will be aborted after ~11 minutes of retries. >>>> >>>> DB clients don't even enable tcp keepalives IIRC. >>>> >>>> So then we depend retransmits timing out: >>>> tcp_retries2 - INTEGER >>>> This value influences the timeout of an alive TCP connection, >>>> when RTO retransmissions remain unacknowledged. >>>> Given a value of N, a hypothetical TCP connection following >>>> exponential backoff with an initial RTO of TCP_RTO_MIN would >>>> retransmit N times before killing the connection at the (N+1)th RTO. >>>> >>>> The default value of 15 yields a hypothetical timeout of 924.6 >>>> seconds and is a lower bound for the effective timeout. >>>> TCP will effectively time out at the first RTO which exceeds the >>>> hypothetical timeout. >>>> >>>> RFC 1122 recommends at least 100 seconds for the timeout, >>>> which corresponds to a value of at least 8. >>>> >>>> Isn't it possible that the connection suddenly goes down in between a >>>> transaction request and its reply but with all TCP segments being >>>> ACKed? >>> >>> Right. If the client already sent the request and only waits for a >>> reply, >>> the retransmission timeout will not have any effect. An idle TCP session >>> may stay open practically forever, because it is designed to survive >>> network >>> disruptions. There is no such thing as a TCP timeout in a general case. >>> >> >> I was wondering if you have more thoughts on this? Should we go for the >> approach where OVN sets the probe interval to a "reasonably large" >> value by default? For example, using 30s would mean failure after ~60 >> seconds of inactivity. > > Actually I’m okay with your proposal to set a reasonable high inactivity > probe (120/180 seconds?). > Maybe it should be not only for non-daemon operation of dbctl but also > for all modes? >
It's probably a good idea, yes. > Alternatively, I thought it could be useful to add a new command line > option like --inactivity-probe=N for the utilities to allow the user to > choose and set interval which is needed in each case. > > What do you think about it? > Northd uses NB_Global.options:northd_probe_interval as inactivity interval. Can we do the same thing and use as default the high value until the first successful connection to the NB? >> >> Thanks, >> Dumitru >> >>>> >>>>>> On 31 Mar 2023, at 15:55, Dumitru Ceara <[email protected]> wrote: >>>>>> >>>>>> On 3/31/23 14:46, Vladislav Odintsov wrote: >>>>>>> Hi Dumitru, >>>>>>> >>>>>>>> On 31 Mar 2023, at 15:01, Dumitru Ceara <[email protected]> wrote: >>>>>>>> >>>>>>>> On 3/31/23 11:43, Ales Musil wrote: >>>>>>>>> On Thu, Mar 23, 2023 at 8:25 PM Vladislav Odintsov >>>>>>>>> <[email protected]> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> For large OVN_Southbound DBs defatult interval of 5000 ms >>>>>>>>>> could be not >>>>>>>>>> sufficient. This patch disables OVSDB inactivity probes for >>>>>>>>>> ovn-*ctl >>>>>>>>>> running >>>>>>>>>> in non-daemon mode. >>>>>>>>>> >>>>>>>>>> Signed-off-by: Vladislav Odintsov <[email protected]> >>>>>>>>>> --- >>>>>>>>>> utilities/ovn-dbctl.c | 3 +++ >>>>>>>>>> 1 file changed, 3 insertions(+) >>>>>>>>>> >>>>>>>>>> diff --git a/utilities/ovn-dbctl.c b/utilities/ovn-dbctl.c >>>>>>>>>> index 369a6a663..4307a5cae 100644 >>>>>>>>>> --- a/utilities/ovn-dbctl.c >>>>>>>>>> +++ b/utilities/ovn-dbctl.c >>>>>>>>>> @@ -208,6 +208,9 @@ ovn_dbctl_main(int argc, char *argv[], >>>>>>>>>> if (daemon_mode) { >>>>>>>>>> server_loop(dbctl_options, idl, argc, argv_); >>>>>>>>>> } else { >>>>>>>>>> + /* Disable OVSDB probe interval for non-daemon mode. */ >>>>>>>>>> + ovsdb_idl_set_probe_interval(idl, 0); >>>>>>>> >>>>>>>> I think I'd avoid using the idl function directly and call instead: >>>>>>>> >>>>>>>> set_idl_probe_interval(idl, 0); >>>>>>>> >>>>>>>> Just to keep it aligned with all other uses in OVN. I can patch >>>>>>>> that at >>>>>>>> apply time if it looks OK to you. >>>>>>> >>>>>>> I’ve got no objections here. >>>>>>> Small nit: set_idl_probe_interval function needs also a remote. >>>>>>> Like this: >>>>>>> >>>>>>> set_idl_probe_interval(idl, db, 0); >>>>>>> >>>>>>> Also, please correct typo in commit message: defatult -> default. >>>>>>> >>>>>> >>>>>> In light of the ovs-discuss thread [0] is it maybe better to just set >>>>>> this probe interval to a very high value instead? That's for the case >>>>>> when ovn-nbctl/sbctl daemon <-> ovsdb-server connection dies >>>>>> because of >>>>>> for example cable being unplugged somewhere on the way between the >>>>>> two. >>>>>> >>>>>> [0] >>>>>> https://mail.openvswitch.org/pipermail/ovs-discuss/2023-March/052324.html >>>>>> >>>>>>>> >>>>>>>>>> + >>>>>>>>>> struct ctl_command *commands; >>>>>>>>>> size_t n_commands; >>>>>>>>>> char *error; >>>>>>>>>> -- >>>>>>>>>> 2.36.1 >>>>>>>>>> >>>>>>>>>> _______________________________________________ >>>>>>>>>> dev mailing list >>>>>>>>>> [email protected] >>>>>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >>>>>>>>>> >>>>>>>>>> >>>>>>>>> Looks good to me, thanks. >>>>>>>>> >>>>>>>>> Reviewed-by: Ales Musil <[email protected]> >>>>>>>>> >>>>>>>> >>>>>>>> Vladislav, Ales, I was thinking of backporting this to stable >>>>>>>> branches >>>>>>>> too, what do you think? >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Dumitru >>>>>>> >>>>>>> >>>>>>> Regards, >>>>>>> Vladislav Odintsov >>>>>>> >>>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> dev mailing list >>>>>> [email protected] <mailto:[email protected]> >>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >>>>> >>>>> >>>>> Regards, >>>>> Vladislav Odintsov >>>>> >>>>> >>>> >>>> _______________________________________________ >>>> dev mailing list >>>> [email protected] >>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >>> >> > > > Regards, > Vladislav Odintsov > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
