Hi Dumitru, > On 13 Apr 2023, at 11:55, Dumitru Ceara <[email protected]> wrote: > > Hi Vladislav, > > On 4/3/23 12:08, Ilya Maximets wrote: >> On 3/31/23 17:17, Dumitru Ceara wrote: >>> On 3/31/23 16:51, Vladislav Odintsov wrote: >>>> As I understood from Ilya, in case of one-command run of ovn-sbctl >>>> (non-daemon mode), it doesn’t make sense to have client -> server >>>> inactivity probes. If cable is unplugged, will just hit tcp session >>>> timeout, IIUC. >>>> Please correct me if I’m wrong. >>>> >>> >>> You're right but the default timeouts are quite large for TCP sessions: >>> >>> With keepalives default: >>> tcp_keepalive_time - INTEGER >>> How often TCP sends out keepalive messages when keepalive is enabled. >>> Default: 2hours. >>> >>> tcp_keepalive_probes - INTEGER >>> How many keepalive probes TCP sends out, until it decides that the >>> connection is broken. Default value: 9. >>> >>> tcp_keepalive_intvl - INTEGER >>> How frequently the probes are send out. Multiplied by >>> tcp_keepalive_probes it is time to kill not responding connection, >>> after probes started. Default value: 75sec i.e. connection >>> will be aborted after ~11 minutes of retries. >>> >>> DB clients don't even enable tcp keepalives IIRC. >>> >>> So then we depend retransmits timing out: >>> tcp_retries2 - INTEGER >>> This value influences the timeout of an alive TCP connection, >>> when RTO retransmissions remain unacknowledged. >>> Given a value of N, a hypothetical TCP connection following >>> exponential backoff with an initial RTO of TCP_RTO_MIN would >>> retransmit N times before killing the connection at the (N+1)th RTO. >>> >>> The default value of 15 yields a hypothetical timeout of 924.6 >>> seconds and is a lower bound for the effective timeout. >>> TCP will effectively time out at the first RTO which exceeds the >>> hypothetical timeout. >>> >>> RFC 1122 recommends at least 100 seconds for the timeout, >>> which corresponds to a value of at least 8. >>> >>> Isn't it possible that the connection suddenly goes down in between a >>> transaction request and its reply but with all TCP segments being >>> ACKed? >> >> Right. If the client already sent the request and only waits for a reply, >> the retransmission timeout will not have any effect. An idle TCP session >> may stay open practically forever, because it is designed to survive network >> disruptions. There is no such thing as a TCP timeout in a general case. >> > > I was wondering if you have more thoughts on this? Should we go for the > approach where OVN sets the probe interval to a "reasonably large" > value by default? For example, using 30s would mean failure after ~60 > seconds of inactivity.
Actually I’m okay with your proposal to set a reasonable high inactivity probe (120/180 seconds?). Maybe it should be not only for non-daemon operation of dbctl but also for all modes? Alternatively, I thought it could be useful to add a new command line option like --inactivity-probe=N for the utilities to allow the user to choose and set interval which is needed in each case. What do you think about it? > > Thanks, > Dumitru > >>> >>>>> On 31 Mar 2023, at 15:55, Dumitru Ceara <[email protected]> wrote: >>>>> >>>>> On 3/31/23 14:46, Vladislav Odintsov wrote: >>>>>> Hi Dumitru, >>>>>> >>>>>>> On 31 Mar 2023, at 15:01, Dumitru Ceara <[email protected]> wrote: >>>>>>> >>>>>>> On 3/31/23 11:43, Ales Musil wrote: >>>>>>>> On Thu, Mar 23, 2023 at 8:25 PM Vladislav Odintsov <[email protected]> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> For large OVN_Southbound DBs defatult interval of 5000 ms could be not >>>>>>>>> sufficient. This patch disables OVSDB inactivity probes for ovn-*ctl >>>>>>>>> running >>>>>>>>> in non-daemon mode. >>>>>>>>> >>>>>>>>> Signed-off-by: Vladislav Odintsov <[email protected]> >>>>>>>>> --- >>>>>>>>> utilities/ovn-dbctl.c | 3 +++ >>>>>>>>> 1 file changed, 3 insertions(+) >>>>>>>>> >>>>>>>>> diff --git a/utilities/ovn-dbctl.c b/utilities/ovn-dbctl.c >>>>>>>>> index 369a6a663..4307a5cae 100644 >>>>>>>>> --- a/utilities/ovn-dbctl.c >>>>>>>>> +++ b/utilities/ovn-dbctl.c >>>>>>>>> @@ -208,6 +208,9 @@ ovn_dbctl_main(int argc, char *argv[], >>>>>>>>> if (daemon_mode) { >>>>>>>>> server_loop(dbctl_options, idl, argc, argv_); >>>>>>>>> } else { >>>>>>>>> + /* Disable OVSDB probe interval for non-daemon mode. */ >>>>>>>>> + ovsdb_idl_set_probe_interval(idl, 0); >>>>>>> >>>>>>> I think I'd avoid using the idl function directly and call instead: >>>>>>> >>>>>>> set_idl_probe_interval(idl, 0); >>>>>>> >>>>>>> Just to keep it aligned with all other uses in OVN. I can patch that at >>>>>>> apply time if it looks OK to you. >>>>>> >>>>>> I’ve got no objections here. >>>>>> Small nit: set_idl_probe_interval function needs also a remote. Like >>>>>> this: >>>>>> >>>>>> set_idl_probe_interval(idl, db, 0); >>>>>> >>>>>> Also, please correct typo in commit message: defatult -> default. >>>>>> >>>>> >>>>> In light of the ovs-discuss thread [0] is it maybe better to just set >>>>> this probe interval to a very high value instead? That's for the case >>>>> when ovn-nbctl/sbctl daemon <-> ovsdb-server connection dies because of >>>>> for example cable being unplugged somewhere on the way between the two. >>>>> >>>>> [0] >>>>> https://mail.openvswitch.org/pipermail/ovs-discuss/2023-March/052324.html >>>>> >>>>>>> >>>>>>>>> + >>>>>>>>> struct ctl_command *commands; >>>>>>>>> size_t n_commands; >>>>>>>>> char *error; >>>>>>>>> -- >>>>>>>>> 2.36.1 >>>>>>>>> >>>>>>>>> _______________________________________________ >>>>>>>>> dev mailing list >>>>>>>>> [email protected] >>>>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >>>>>>>>> >>>>>>>>> >>>>>>>> Looks good to me, thanks. >>>>>>>> >>>>>>>> Reviewed-by: Ales Musil <[email protected]> >>>>>>>> >>>>>>> >>>>>>> Vladislav, Ales, I was thinking of backporting this to stable branches >>>>>>> too, what do you think? >>>>>>> >>>>>>> Thanks, >>>>>>> Dumitru >>>>>> >>>>>> >>>>>> Regards, >>>>>> Vladislav Odintsov >>>>>> >>>>>> >>>>> >>>>> _______________________________________________ >>>>> dev mailing list >>>>> [email protected] <mailto:[email protected]> >>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >>>> >>>> >>>> Regards, >>>> Vladislav Odintsov >>>> >>>> >>> >>> _______________________________________________ >>> dev mailing list >>> [email protected] >>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >> > Regards, Vladislav Odintsov _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
