On 3/31/23 16:51, Vladislav Odintsov wrote:
> As I understood from Ilya, in case of one-command run of ovn-sbctl 
> (non-daemon mode), it doesn’t make sense to have client -> server inactivity 
> probes. If cable is unplugged, will just hit tcp session timeout, IIUC.
> Please correct me if I’m wrong.
> 

You're right but the default timeouts are quite large for TCP sessions:

With keepalives default:
tcp_keepalive_time - INTEGER
        How often TCP sends out keepalive messages when keepalive is enabled.
        Default: 2hours.

tcp_keepalive_probes - INTEGER
        How many keepalive probes TCP sends out, until it decides that the
        connection is broken. Default value: 9.

tcp_keepalive_intvl - INTEGER
        How frequently the probes are send out. Multiplied by
        tcp_keepalive_probes it is time to kill not responding connection,
        after probes started. Default value: 75sec i.e. connection
        will be aborted after ~11 minutes of retries.

DB clients don't even enable tcp keepalives IIRC.

So then we depend retransmits timing out:
tcp_retries2 - INTEGER
        This value influences the timeout of an alive TCP connection,
        when RTO retransmissions remain unacknowledged.
        Given a value of N, a hypothetical TCP connection following
        exponential backoff with an initial RTO of TCP_RTO_MIN would
        retransmit N times before killing the connection at the (N+1)th RTO.

        The default value of 15 yields a hypothetical timeout of 924.6
        seconds and is a lower bound for the effective timeout.
        TCP will effectively time out at the first RTO which exceeds the
        hypothetical timeout.

        RFC 1122 recommends at least 100 seconds for the timeout,
        which corresponds to a value of at least 8.

Isn't it possible that the connection suddenly goes down in between a
transaction request and its reply but with all TCP segments being
ACKed?

>> On 31 Mar 2023, at 15:55, Dumitru Ceara <dce...@redhat.com> wrote:
>>
>> On 3/31/23 14:46, Vladislav Odintsov wrote:
>>> Hi Dumitru,
>>>
>>>> On 31 Mar 2023, at 15:01, Dumitru Ceara <dce...@redhat.com> wrote:
>>>>
>>>> On 3/31/23 11:43, Ales Musil wrote:
>>>>> On Thu, Mar 23, 2023 at 8:25 PM Vladislav Odintsov <odiv...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> For large OVN_Southbound DBs defatult interval of 5000 ms could be not
>>>>>> sufficient.  This patch disables OVSDB inactivity probes for ovn-*ctl
>>>>>> running
>>>>>> in non-daemon mode.
>>>>>>
>>>>>> Signed-off-by: Vladislav Odintsov <odiv...@gmail.com>
>>>>>> ---
>>>>>> utilities/ovn-dbctl.c | 3 +++
>>>>>> 1 file changed, 3 insertions(+)
>>>>>>
>>>>>> diff --git a/utilities/ovn-dbctl.c b/utilities/ovn-dbctl.c
>>>>>> index 369a6a663..4307a5cae 100644
>>>>>> --- a/utilities/ovn-dbctl.c
>>>>>> +++ b/utilities/ovn-dbctl.c
>>>>>> @@ -208,6 +208,9 @@ ovn_dbctl_main(int argc, char *argv[],
>>>>>>    if (daemon_mode) {
>>>>>>        server_loop(dbctl_options, idl, argc, argv_);
>>>>>>    } else {
>>>>>> +        /* Disable OVSDB probe interval for non-daemon mode. */
>>>>>> +        ovsdb_idl_set_probe_interval(idl, 0);
>>>>
>>>> I think I'd avoid using the idl function directly and call instead:
>>>>
>>>> set_idl_probe_interval(idl, 0);
>>>>
>>>> Just to keep it aligned with all other uses in OVN.  I can patch that at
>>>> apply time if it looks OK to you.
>>>
>>> I’ve got no objections here.
>>> Small nit: set_idl_probe_interval function needs also a remote. Like this:
>>>
>>> set_idl_probe_interval(idl, db, 0);
>>>
>>> Also, please correct typo in commit message: defatult -> default.
>>>
>>
>> In light of the ovs-discuss thread [0] is it maybe better to just set
>> this probe interval to a very high value instead?  That's for the case
>> when ovn-nbctl/sbctl daemon <-> ovsdb-server connection dies because of
>> for example cable being unplugged somewhere on the way between the two.
>>
>> [0]
>> https://mail.openvswitch.org/pipermail/ovs-discuss/2023-March/052324.html
>>
>>>>
>>>>>> +
>>>>>>        struct ctl_command *commands;
>>>>>>        size_t n_commands;
>>>>>>        char *error;
>>>>>> --
>>>>>> 2.36.1
>>>>>>
>>>>>> _______________________________________________
>>>>>> dev mailing list
>>>>>> d...@openvswitch.org
>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>>>>
>>>>>>
>>>>> Looks good to me, thanks.
>>>>>
>>>>> Reviewed-by: Ales Musil <amu...@redhat.com>
>>>>>
>>>>
>>>> Vladislav, Ales, I was thinking of backporting this to stable branches
>>>> too, what do you think?
>>>>
>>>> Thanks,
>>>> Dumitru
>>>
>>>
>>> Regards,
>>> Vladislav Odintsov
>>>
>>>
>>
>> _______________________________________________
>> dev mailing list
>> d...@openvswitch.org <mailto:d...@openvswitch.org>
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> 
> 
> Regards,
> Vladislav Odintsov
> 
> 

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to