On Tue, Mar 7, 2023 at 5:43 PM Ilya Maximets via discuss
<ovs-discuss@openvswitch.org> wrote:
>
> On 3/7/23 16:58, Vladislav Odintsov wrote:
> > I’ve sent the last mail from the wrong account and the indentation was lost.
> > Resending...
> >
> >> On 7 Mar 2023, at 18:01, Vladislav Odintsov via discuss 
> >> <ovs-discuss@openvswitch.org> wrote:
> >>
> >> Thanks Ilya for the quick and detailed response!
> >>
> >>> On 7 Mar 2023, at 14:03, Ilya Maximets via discuss 
> >>> <ovs-discuss@openvswitch.org> wrote:
> >>>
> >>> On 3/7/23 00:15, Vladislav Odintsov wrote:
> >>>> Hi Ilya,
> >>>>
> >>>> I’m wondering whether there is any configuration parameter for the
> >>>> ovsdb relay -> main ovsdb server inactivity probe timer.
> >>>> My cluster is experiencing issues where the relay disconnects from the
> >>>> main cluster due to the 5-second inactivity probe timeout.
> >>>> The main cluster has quite a big database and a bunch of daemons
> >>>> connecting to it, which makes it difficult to maintain connections in time.
> >>>>
> >>>> For ovsdb relay as a remote I use in-db configuration (to provide 
> >>>> inactivity probe and rbac configuration for ovn-controllers).
> >>>> For ovsdb-server, which serves SB, I just set --remote=pssl:<port>.
> >>>>
> >>>> I’d like to configure the remote for the ovsdb cluster via the DB so that
> >>>> I can set the inactivity probe, but I’m not sure about the correct way to do that.
> >>>>
> >>>> For now I see only two options:
> >>>> 1. Set up a custom database schema with a connection table, serve it in the
> >>>> same SB cluster and specify this connection when starting the ovsdb SB server.
> >>>
> >>> There is an ovsdb/local-config.ovsschema shipped with OVS that can be
> >>> used for that purpose.  But you'll need to craft transactions for it
> >>> manually with ovsdb-client.
> >>>
> >>> There is a control tool prepared by Terry:
> >>>  
> >>> https://patchwork.ozlabs.org/project/openvswitch/patch/20220713030250.2634491-1-twil...@redhat.com/
> >>
> >> Thanks for pointing out the patch; I’ll test it out.
> >>
> >>>
> >>> But it's not in the repo yet (I need to get back to reviews on that
> >>> topic at some point).  The tool itself should be fine, but maybe name
> >>> will change.
> >>
> >> Am I right that the in-DB remote configuration must be hosted in a
> >> database served by this same ovsdb-server?
>
> Yes.
>
> >> What is the best way to configure an additional DB on ovsdb-server so that
> >> this configuration is permanent?
>
> You may specify multiple database files on the command-line for ovsdb-server
> process.  It will open and serve each of them.  They all can be in different
> modes, e.g. you have multiple clustered, standalone and relay databases in
> the same ovsdb-server process.
>
> There is also ovsdb-server/add-db appctl to add a new database to a running
> process, but it will not survive the restart.
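
A minimal Python sketch of the persistent variant described above: every database file is passed on the ovsdb-server command line, and the listener is read from the Local_Config database instead of --remote=pssl:<port>. The file paths and the helper name are hypothetical. The db:Local_Config,Config,connections remote assumes the table and column names of the shipped ovsdb/local-config.ovsschema, so check them against your OVS version. SSL and logging options that a real deployment needs are left out.

```python
import shlex

# Hypothetical paths; substitute the locations used by your deployment.
SB_DB = "/etc/ovn/ovnsb_db.db"                # clustered OVN Southbound database
LOCAL_CONFIG_DB = "/etc/ovn/local_config.db"  # standalone Local_Config database


def ovsdb_server_argv(db_files, remote):
    """Build an ovsdb-server command line that serves every file in db_files."""
    if not db_files:
        raise ValueError("at least one database file is required")
    return ["ovsdb-server", f"--remote={remote}", *db_files]


# Assumption: the Local_Config schema has a Config table with a
# "connections" column, as in the shipped local-config.ovsschema.
argv = ovsdb_server_argv(
    [SB_DB, LOCAL_CONFIG_DB],
    remote="db:Local_Config,Config,connections",
)
print(shlex.join(argv))
```

Because both files are named on the command line, the same set of databases is opened again after a restart, unlike a database added with ovsdb-server/add-db.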
>
> >> Also, do I understand correctly that this DB does not need to be
> >> clustered?
>
> It's kind of a point of the Local_Config database to not be clustered.
> The original use case was to allow each cluster member to listen on a
> different IP. i.e. if you don't want to listen on 0.0.0.0 and your
> cluster members are on different nodes, so have different listening IPs.
>
> >>
> >>>
> >>>> 2. Setup second connection in ovn sb database to be used for ovsdb 
> >>>> cluster and deploy cluster separately from ovsdb relay, because they 
> >>>> both start same connections and conflict on ports. (I don’t use docker 
> >>>> here, so I need a separate server for that).
> >>>
> >>> That's an easy option available right now, true.  If they are deployed
> >>> on different nodes, you may even use the same connection record.
> >>>
> >>>>
> >>>> Anyway, if I configure the remote for the ovsdb cluster with a specific
> >>>> inactivity probe (say, 60k), I guess that is still not enough to get
> >>>> ovsdb pings every 60 seconds. The inactivity probe must be the same on
> >>>> both ends, including the ovsdb relay process, right?
> >>>
> >>> Inactivity probes don't need to be the same.  They are separate for each
> >>> side of a connection and so configured separately.
> >>>
> >>> You can set up inactivity probe for the server side of the connection via
> >>> database.  So, server will probe the relay every 60 seconds, but today
> >>> it's not possible to set inactivity probe for the relay-to-server 
> >>> direction.
> >>> So, relay will probe the server every 5 seconds.
> >>>
> >>> The way out from this situation is to allow configuration of relays via
> >>> database as well, e.g. relay:db:Local_Config,Config,relays.  This will
> >>> require addition of a new table to the Local_Config database and allowing
> >>> relay config to be parsed from the database in the code.  That wasn't
> >>> implemented yet.
> >>>
> >>>> I saw your talk on last ovscon about this topic, and the solution was in 
> >>>> progress there. But maybe there were some changes from that time? I’m 
> >>>> ready to test it if any. Or, maybe there’s any workaround?
> >>>
> >>> Sorry, we didn't move forward much on that topic since the presentation.
> >>> There are a few unanswered questions around the local config database,
> >>> mainly regarding upgrades from cmdline/main db-based configuration to
> >>> local config-based.  But I hope we can figure that out in the current release
> >>> time frame, i.e. before 3.2 release.
> >
> > Regarding the configuration method… Just as an idea (I haven’t seen this
> > variant among the options discussed).
> > Adding and removing remotes is possible via the ovsdb-server ctl socket.
> > Could introducing a new command
> > "ovsdb-server/set-remote-param PARAM=VALUE" be a solution here?
>
> Yes, we could.  But it was kind of a point of the OVS Conf. presentation:
> To have a unified way for the database server configuration via the database.
>
> For this way of configuration to be successful, IMHO, we should refrain
> from expanding appctl and command-line interfaces.  Otherwise, we will have
> 3 differently incomplete ways of doing the same thing forever. :/
>
> If you need a quick'n'dirty solution that doesn't survive restarts, appctl
> command should be fairly easy to implement.
>
> >
> >>>
> >>> There is also this workaround:
> >>>  
> >>> https://patchwork.ozlabs.org/project/openvswitch/patch/an2a4qcpihpcfukyt1uomqre.1.1641782536691.hmail.wentao....@easystack.cn/
> >>> It simply takes the server->relay inactivity probe value and applies it
> >>> to the relay->server connection.  But it's not a correct solution, because
> >>> it relies on certain database names.
> >>>
> >>> Out of curiosity, what kind of poll intervals do you see on your main server
> >>> setup that trigger inactivity probe failures?  Can an upgrade to OVS 3.1
> >>> solve some of these issues?  3.1 should be noticeably faster than 2.17,
> >>> and also parallel compaction introduced in 3.0 removes one of the big
> >>> reasons for large poll intervals.  OVN upgrade to 22.09+ or even 23.03
> >>> should also help with database sizes.
> >>
> >> We see failures on the OVSDB Relay side:
> >>
> >> 2023-03-06T22:19:32.966Z|00099|reconnect|ERR|ssl:xxx:16642: no response to 
> >> inactivity probe after 5 seconds, disconnecting
> >> 2023-03-06T22:19:32.966Z|00100|reconnect|INFO|ssl:xxx:16642: connection 
> >> dropped
> >> 2023-03-06T22:19:40.989Z|00101|reconnect|INFO|ssl:xxx:16642: connected
> >> 2023-03-06T22:19:50.997Z|00102|reconnect|ERR|ssl:xxx:16642: no response to 
> >> inactivity probe after 5 seconds, disconnecting
> >> 2023-03-06T22:19:50.997Z|00103|reconnect|INFO|ssl:xxx:16642: connection 
> >> dropped
> >> 2023-03-06T22:19:59.022Z|00104|reconnect|INFO|ssl:xxx:16642: connected
> >> 2023-03-06T22:20:09.026Z|00105|reconnect|ERR|ssl:xxx:16642: no response to 
> >> inactivity probe after 5 seconds, disconnecting
> >> 2023-03-06T22:20:09.026Z|00106|reconnect|INFO|ssl:xxx:16642: connection 
> >> dropped
> >> 2023-03-06T22:20:17.052Z|00107|reconnect|INFO|ssl:xxx:16642: connected
> >> 2023-03-06T22:20:27.056Z|00108|reconnect|ERR|ssl:xxx:16642: no response to 
> >> inactivity probe after 5 seconds, disconnecting
> >> 2023-03-06T22:20:27.056Z|00109|reconnect|INFO|ssl:xxx:16642: connection 
> >> dropped
> >> 2023-03-06T22:20:35.111Z|00110|reconnect|INFO|ssl:xxx:16642: connected
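
The timestamps in the relay log above can be checked with a short script. This is only a sketch: the log lines are copied from the excerpt with the mail line wrapping undone, and the function names are made up for illustration. It measures how long each connection stayed up before the probe failure was logged.

```python
from datetime import datetime

# Relay log lines from the excerpt above, unwrapped.
LOG = """\
2023-03-06T22:19:32.966Z|00099|reconnect|ERR|ssl:xxx:16642: no response to inactivity probe after 5 seconds, disconnecting
2023-03-06T22:19:32.966Z|00100|reconnect|INFO|ssl:xxx:16642: connection dropped
2023-03-06T22:19:40.989Z|00101|reconnect|INFO|ssl:xxx:16642: connected
2023-03-06T22:19:50.997Z|00102|reconnect|ERR|ssl:xxx:16642: no response to inactivity probe after 5 seconds, disconnecting
2023-03-06T22:19:50.997Z|00103|reconnect|INFO|ssl:xxx:16642: connection dropped
2023-03-06T22:19:59.022Z|00104|reconnect|INFO|ssl:xxx:16642: connected
2023-03-06T22:20:09.026Z|00105|reconnect|ERR|ssl:xxx:16642: no response to inactivity probe after 5 seconds, disconnecting
2023-03-06T22:20:09.026Z|00106|reconnect|INFO|ssl:xxx:16642: connection dropped
2023-03-06T22:20:17.052Z|00107|reconnect|INFO|ssl:xxx:16642: connected
2023-03-06T22:20:27.056Z|00108|reconnect|ERR|ssl:xxx:16642: no response to inactivity probe after 5 seconds, disconnecting
2023-03-06T22:20:27.056Z|00109|reconnect|INFO|ssl:xxx:16642: connection dropped
2023-03-06T22:20:35.111Z|00110|reconnect|INFO|ssl:xxx:16642: connected
"""


def parse_line(line):
    """Split 'timestamp|seq|module|level|message' into typed fields."""
    stamp, _seq, module, level, message = line.split("|", 4)
    when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    return when, module, level, message


def connected_durations(log_text):
    """Seconds between each 'connected' and the following probe failure."""
    durations = []
    connected_at = None
    for line in log_text.splitlines():
        if not line.strip():
            continue
        when, module, _level, message = parse_line(line)
        if module != "reconnect":
            continue
        if message.endswith(": connected"):
            connected_at = when
        elif "no response to inactivity probe" in message and connected_at:
            durations.append((when - connected_at).total_seconds())
            connected_at = None
    return durations


for seconds in connected_durations(LOG):
    print(f"connection lasted {seconds:.3f} s before the probe failure")
```

Each connection in the excerpt lasts about 10 seconds. That would be consistent with the relay seeing no traffic for 5 seconds, sending a probe, and then waiting another 5 seconds for a reply that never arrives.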
> >>
> >> On the DB cluster this looks like:
> >>
> >> 2023-03-06T22:19:04.208Z|00451|stream_ssl|WARN|SSL_read: unexpected SSL 
> >> connection close
> >> 2023-03-06T22:19:04.211Z|00452|reconnect|WARN|ssl:xxx:52590: connection 
> >> dropped (Protocol error)
>
> OK.  These are symptoms.  The cause must be something like
> 'Unreasonably long MANY ms poll interval' on the DB cluster side.
> i.e. the reason why the main DB cluster didn't reply to the
> probes sent from the relay.  Because as soon as server receives
> the probe, it replies right back.  If it didn't reply, it was
> doing something else for an extended period of time.  "MANY" is
> more than 5 seconds.
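
One way to look for that cause is to scan the logs of the main cluster members for poll-interval warnings longer than the relay's 5-second probe. The sketch below assumes the warning text has the form "Unreasonably long <N>ms poll interval" and that each log line starts with a timestamp followed by "|", as in the excerpts in this thread. The function name and the default threshold are illustrative.

```python
import re

# Assumed message format: "Unreasonably long <N>ms poll interval ...".
POLL_RE = re.compile(r"Unreasonably long\s+(\d+)\s*ms poll interval")


def long_poll_intervals(lines, threshold_ms=5000):
    """Return (timestamp, milliseconds) for poll intervals above threshold_ms."""
    hits = []
    for line in lines:
        match = POLL_RE.search(line)
        if match is None:
            continue
        interval_ms = int(match.group(1))
        if interval_ms > threshold_ms:
            # The timestamp is everything before the first "|".
            hits.append((line.split("|", 1)[0], interval_ms))
    return hits
```

Calling long_poll_intervals(open(path)) on the ovsdb-server log of each cluster member, and comparing the returned timestamps with the relay's disconnects, should show whether the server was busy when the probes went unanswered.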
>
> >> Does this mean that configuring the inactivity probe on the DB cluster
> >> side will not help and that it must be configured on the relay side?
>
> Yes.  You likely need a configuration on the relay side.

Sorry for butting into an ongoing discussion, but this part resonated
with one of my past ventures. While investigating a different problem
we kind of hit a similar problem [0]. Aligning client, relay and
backend server configuration has potential to become complicated.
Would an alternative be for the real server and relay server to
exchange this information in-line as part of their communication, for
example exposing it in the special _Server built-in database [1]?

0: https://bugs.launchpad.net/ubuntu/lunar/+source/openvswitch/+bug/1998781/comments/3
1: https://github.com/openvswitch/ovs/blob/master/ovsdb/_server.ovsschema

-- 
Frode Nordahl

> >>
> >> We already run OVN 22.09.1 with some backports from next versions.
> >> The OVS version is 2.17, so I think it’s possible to try upgrading OVS to
> >> 3.1. I’ll take a look at the changelog, thanks for pointing this out!
>
> 3.1 should definitely improve the database performance.
> See the other OVSDB talk from the conference for details. :)
>
> P.S. One of the reasons for Sb DB growth and the subsequent slowing
> down of the ovsdb-server might be growth of the MAC_Binding table.
> MAC_Binding aging is available in 22.09, you can try enabling it
> if that's the problem in your setup (just a guess).
>
> Best regards, Ilya Maximets.
> _______________________________________________
> discuss mailing list
> disc...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
