On Wed, Mar 06, 2024 at 10:29:29AM +0300, Vladislav Odintsov wrote:
> Hi Felix,
>
> > On 6 Mar 2024, at 10:16, Felix Huettner via discuss
> > <ovs-discuss@openvswitch.org> wrote:
> >
> > Hi Srini,
> >
> > I can share what works for us for ~1k hypervisors:
> >
> > On Tue, Mar 05, 2024 at 09:51:43PM -0800, Sri kor via discuss wrote:
> >> Hi Team,
> >>
> >>
> >> Currently, we are using OVN in RAFT cluster mode. We have 3 NB and SB
> >> ovsdb-servers operating in RAFT cluster mode. Currently we have 500
> >> hypervisors connected to this RAFT cluster.
> >>
> >> For our next deployment, our scale would increase to 3000 hypervisors. To
> >> accommodate this number of hypervisors, we are migrating to DB relay with
> >> a multigroup deployment model. This helps with OVN SB DB read
> >> transactions. But for write transactions, only the leader in the RAFT
> >> cluster can update the DB. This creates load on the RAFT leader. Is
> >> there a way to address the load on the RAFT cluster leader?
> >
> > We do the following:
> > * If you need TLS on the ovsdb path, separate it out to some
> >   reverse proxy that can do just L4 TLS termination (e.g. traefik, or so)
>
> Do I understand correctly that with such TLS "offload" you can't use RBAC for
> hypervisors?
>
Yes, that is the unfortunate side effect.

> > * Have nobody besides northd connect to the SB DB directly, everyone
> >   else needs to use a relay
> > * Do not run backups on the cluster leader, but on one of the current
> >   followers
> > * Increase the raft election timeout significantly (we have 120s in
> >   there). However there is a patch afaik in 3.3 that makes that better
> > * If you create metrics or so from database content, generate these on
> >   the relays instead of the raft cluster
> >
> > Overall, when our southbound db had issues, most of the time it was some
> > client constantly reconnecting to it and thereby always pulling a full
> > DB dump.
> >
> >>
> >>
> >> As the scale increases, the number of updates coming to the ovn-controller
> >> from OVN SB increases. That creates pressure on ovn-controller. Is there a
> >> way to minimize the load on ovn-controller?
> >
> > Did not see any kind of issue there yet.
> > However if you are using some python tooling outside of OVN (e.g.
> > Openstack) ensure that you have JSON parsing using a C library available
> > in the ovs lib. This brings significant performance benefits if you have
> > a lot of updates.
> > You can check with `python3 -c "import ovs.json; print(ovs.json.PARSER)"`
> > which should return "C".
> >
> >>
> >> I wish there were a way for ovn-controller to subscribe to updates specific
> >> to this hypervisor. Are there any known ovn-controller subscription methods
> >> available and being used in the OVS community?
> >
> > Yes, they do that per default. However for us we saw that this creates
> > increased load on the relays due to the needed additional filtering and
> > json serializing per target node. So we turned it off and thereby trade
> > less ovsdb load for more network bandwidth.
> > Relevant setting is `external_ids:ovn-monitor-all`.
> >
> > Thanks
> > Felix
> >
> >>
> >>
> >> How can I optimize the load on the leader node in an OVN RAFT cluster to
> >> handle increased write transactions?
> >>
> >>
> >>
> >> Thanks,
> >>
> >> Srini
> >
> >> _______________________________________________
> >> discuss mailing list
> >> disc...@openvswitch.org
> >> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>
>
> Regards,
> Vladislav Odintsov
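
For reference, a minimal sketch of the two client-side items named in the thread: the `ovs.json.PARSER` check and the `external_ids:ovn-monitor-all` setting. The function names are hypothetical. The `ovs-vsctl set open_vswitch . ...` command form and the value `true` (monitor everything, i.e. per-chassis conditional monitoring turned off) are assumptions, since the thread only names the key. The script builds and prints the command and does not run it.

```python
import importlib


def ovs_json_parser():
    """Return ovs.json.PARSER, or None when the ovs Python package is missing."""
    try:
        ovs_json = importlib.import_module("ovs.json")
    except ImportError:
        return None
    return getattr(ovs_json, "PARSER", None)


def uses_c_parser(parser):
    """True only when the parser name is "C", the value the thread expects."""
    return parser == "C"


def monitor_all_command(enabled=True):
    """Build (but do not run) an ovs-vsctl argv that sets ovn-monitor-all.

    Assumption: the key lives in external_ids of the local Open_vSwitch record.
    """
    value = "true" if enabled else "false"
    return ["ovs-vsctl", "set", "open_vswitch", ".",
            "external_ids:ovn-monitor-all=" + value]


if __name__ == "__main__":
    parser = ovs_json_parser()
    if parser is None:
        print("ovs Python package not installed")
    elif uses_c_parser(parser):
        print("ovs.json uses the C parser")
    else:
        print("ovs.json uses a non-C parser:", parser)
    # Printed for review only; run it by hand on a hypervisor if it fits your setup.
    print(" ".join(monitor_all_command(True)))
```

Printing the command instead of executing it keeps the sketch safe to run on any machine, including ones without Open vSwitch installed.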