Hi Dumitru,

I’m a bit concerned about our last conversation’s idea how to set inactivity 
probe.
The point is that in general case dbctl module can be used by multiple existing 
and possible future ovn-*ctl utilities.
So, to configure a basic (we may call it a "startup" idl inactivity probe), 
which is hardcoded, let’s say to 120000 ms, we should do this inside dbctl.c 
file.
But if we’re talking about to read the contents of DB and try to find 
inactivity probe configuration there - such code must be in particular utility: 
ovn-sbctl/ovn-nbctl/etc, right? And it seems to me that the second 
configuration attempt of inactivity probe makes sense only in case of 
daemon-mode.
So it’s needed after initial load of DB contents to read appropriate DB and 
reconfigure interval.

I’ve posted v2 [1], please take a look on it. Let me know if you have any 
thoughts/concern.

1: 
https://patchwork.ozlabs.org/project/ovn/patch/[email protected]/

> On 18 Apr 2023, at 18:31, Vladislav Odintsov <[email protected]> wrote:
> 
> 
> 
>> On 18 Apr 2023, at 16:59, Dumitru Ceara <[email protected]> wrote:
>> 
>> On 4/14/23 15:37, Vladislav Odintsov wrote:
>>> Hi Dumitru,
>>> 
>>>> On 13 Apr 2023, at 11:55, Dumitru Ceara <[email protected]> wrote:
>>>> 
>>>> Hi Vladislav,
>>>> 
>>>> On 4/3/23 12:08, Ilya Maximets wrote:
>>>>> On 3/31/23 17:17, Dumitru Ceara wrote:
>>>>>> On 3/31/23 16:51, Vladislav Odintsov wrote:
>>>>>>> As I understood from Ilya, in case of one-command run of ovn-sbctl
>>>>>>> (non-daemon mode), it doesn’t make sense to have client -> server
>>>>>>> inactivity probes. If cable is unplugged, will just hit tcp session
>>>>>>> timeout, IIUC.
>>>>>>> Please correct me if I’m wrong.
>>>>>>> 
>>>>>> 
>>>>>> You're right but the default timeouts are quite large for TCP sessions:
>>>>>> 
>>>>>> With keepalives default:
>>>>>> tcp_keepalive_time - INTEGER
>>>>>> How often TCP sends out keepalive messages when keepalive is enabled.
>>>>>> Default: 2hours.
>>>>>> 
>>>>>> tcp_keepalive_probes - INTEGER
>>>>>> How many keepalive probes TCP sends out, until it decides that the
>>>>>> connection is broken. Default value: 9.
>>>>>> 
>>>>>> tcp_keepalive_intvl - INTEGER
>>>>>> How frequently the probes are send out. Multiplied by
>>>>>> tcp_keepalive_probes it is time to kill not responding connection,
>>>>>> after probes started. Default value: 75sec i.e. connection
>>>>>> will be aborted after ~11 minutes of retries.
>>>>>> 
>>>>>> DB clients don't even enable tcp keepalives IIRC.
>>>>>> 
>>>>>> So then we depend retransmits timing out:
>>>>>> tcp_retries2 - INTEGER
>>>>>> This value influences the timeout of an alive TCP connection,
>>>>>> when RTO retransmissions remain unacknowledged.
>>>>>> Given a value of N, a hypothetical TCP connection following
>>>>>> exponential backoff with an initial RTO of TCP_RTO_MIN would
>>>>>> retransmit N times before killing the connection at the (N+1)th RTO.
>>>>>> 
>>>>>> The default value of 15 yields a hypothetical timeout of 924.6
>>>>>> seconds and is a lower bound for the effective timeout.
>>>>>> TCP will effectively time out at the first RTO which exceeds the
>>>>>> hypothetical timeout.
>>>>>> 
>>>>>> RFC 1122 recommends at least 100 seconds for the timeout,
>>>>>> which corresponds to a value of at least 8.
>>>>>> 
>>>>>> Isn't it possible that the connection suddenly goes down in between a
>>>>>> transaction request and its reply but with all TCP segments being
>>>>>> ACKed?
>>>>> 
>>>>> Right.  If the client already sent the request and only waits for a
>>>>> reply,
>>>>> the retransmission timeout will not have any effect.  An idle TCP session
>>>>> may stay open practically forever, because it is designed to survive
>>>>> network
>>>>> disruptions.  There is no such thing as a TCP timeout in a general case.
>>>>> 
>>>> 
>>>> I was wondering if you have more thoughts on this?  Should we go for the
>>>> approach where OVN sets the probe interval to a "reasonably large"
>>>> value by default?  For example, using 30s would mean failure after ~60
>>>> seconds of inactivity.
>>> 
>>> Actually I’m okay with your proposal to set a reasonable high inactivity
>>> probe (120/180 seconds?).
>>> Maybe it should be not only for non-daemon operation of dbctl but also
>>> for all modes?
>>> 
>> 
>> It's probably a good idea, yes.
>> 
>>> Alternatively, I thought it could be useful to add a new command line
>>> option like --inactivity-probe=N for the utilities to allow the user to
>>> choose and set interval which is needed in each case.
>>> 
>>> What do you think about it?
>>> 
>> 
>> Northd uses NB_Global.options:northd_probe_interval as inactivity
>> interval.  Can we do the same thing and use as default the high value
>> until the first successful connection to the NB?
> 
> Good idea, I’ll dig into this next week, thanks.
> 
>> 
>>>> 
>>>> Thanks,
>>>> Dumitru
>>>> 
>>>>>> 
>>>>>>>> On 31 Mar 2023, at 15:55, Dumitru Ceara <[email protected]> wrote:
>>>>>>>> 
>>>>>>>> On 3/31/23 14:46, Vladislav Odintsov wrote:
>>>>>>>>> Hi Dumitru,
>>>>>>>>> 
>>>>>>>>>> On 31 Mar 2023, at 15:01, Dumitru Ceara <[email protected]> wrote:
>>>>>>>>>> 
>>>>>>>>>> On 3/31/23 11:43, Ales Musil wrote:
>>>>>>>>>>> On Thu, Mar 23, 2023 at 8:25 PM Vladislav Odintsov
>>>>>>>>>>> <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> For large OVN_Southbound DBs defatult interval of 5000 ms
>>>>>>>>>>>> could be not
>>>>>>>>>>>> sufficient.  This patch disables OVSDB inactivity probes for
>>>>>>>>>>>> ovn-*ctl
>>>>>>>>>>>> running
>>>>>>>>>>>> in non-daemon mode.
>>>>>>>>>>>> 
>>>>>>>>>>>> Signed-off-by: Vladislav Odintsov <[email protected]>
>>>>>>>>>>>> ---
>>>>>>>>>>>> utilities/ovn-dbctl.c | 3 +++
>>>>>>>>>>>> 1 file changed, 3 insertions(+)
>>>>>>>>>>>> 
>>>>>>>>>>>> diff --git a/utilities/ovn-dbctl.c b/utilities/ovn-dbctl.c
>>>>>>>>>>>> index 369a6a663..4307a5cae 100644
>>>>>>>>>>>> --- a/utilities/ovn-dbctl.c
>>>>>>>>>>>> +++ b/utilities/ovn-dbctl.c
>>>>>>>>>>>> @@ -208,6 +208,9 @@ ovn_dbctl_main(int argc, char *argv[],
>>>>>>>>>>>>  if (daemon_mode) {
>>>>>>>>>>>>      server_loop(dbctl_options, idl, argc, argv_);
>>>>>>>>>>>>  } else {
>>>>>>>>>>>> +        /* Disable OVSDB probe interval for non-daemon mode. */
>>>>>>>>>>>> +        ovsdb_idl_set_probe_interval(idl, 0);
>>>>>>>>>> 
>>>>>>>>>> I think I'd avoid using the idl function directly and call instead:
>>>>>>>>>> 
>>>>>>>>>> set_idl_probe_interval(idl, 0);
>>>>>>>>>> 
>>>>>>>>>> Just to keep it aligned with all other uses in OVN.  I can patch
>>>>>>>>>> that at
>>>>>>>>>> apply time if it looks OK to you.
>>>>>>>>> 
>>>>>>>>> I’ve got no objections here.
>>>>>>>>> Small nit: set_idl_probe_interval function needs also a remote.
>>>>>>>>> Like this:
>>>>>>>>> 
>>>>>>>>> set_idl_probe_interval(idl, db, 0);
>>>>>>>>> 
>>>>>>>>> Also, please correct typo in commit message: defatult -> default.
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> In light of the ovs-discuss thread [0] is it maybe better to just set
>>>>>>>> this probe interval to a very high value instead?  That's for the case
>>>>>>>> when ovn-nbctl/sbctl daemon <-> ovsdb-server connection dies
>>>>>>>> because of
>>>>>>>> for example cable being unplugged somewhere on the way between the
>>>>>>>> two.
>>>>>>>> 
>>>>>>>> [0]
>>>>>>>> https://mail.openvswitch.org/pipermail/ovs-discuss/2023-March/052324.html
>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>>> +
>>>>>>>>>>>>      struct ctl_command *commands;
>>>>>>>>>>>>      size_t n_commands;
>>>>>>>>>>>>      char *error;
>>>>>>>>>>>> --
>>>>>>>>>>>> 2.36.1
>>>>>>>>>>>> 
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> dev mailing list
>>>>>>>>>>>> [email protected]
>>>>>>>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> Looks good to me, thanks.
>>>>>>>>>>> 
>>>>>>>>>>> Reviewed-by: Ales Musil <[email protected]>
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Vladislav, Ales, I was thinking of backporting this to stable
>>>>>>>>>> branches
>>>>>>>>>> too, what do you think?
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Dumitru
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Vladislav Odintsov
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> _______________________________________________
>>>>>>>> dev mailing list
>>>>>>>> [email protected] <mailto:[email protected]>
>>>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>>>>> 
>>>>>>> 
>>>>>>> Regards,
>>>>>>> Vladislav Odintsov
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> _______________________________________________
>>>>>> dev mailing list
>>>>>> [email protected]
>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>>> 
>>>> 
>>> 
>>> 
>>> Regards,
>>> Vladislav Odintsov
> 
> 
> Regards,
> Vladislav Odintsov
> 
> _______________________________________________
> dev mailing list
> [email protected] <mailto:[email protected]>
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Regards,
Vladislav Odintsov

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to