On Wed, Mar 06, 2024 at 10:29:29AM +0300, Vladislav Odintsov wrote:
> Hi Felix,
> 
> > On 6 Mar 2024, at 10:16, Felix Huettner via discuss 
> > <ovs-discuss@openvswitch.org> wrote:
> > 
> > Hi Srini,
> > 
> > I can share what works for us for ~1k hypervisors:
> > 
> > On Tue, Mar 05, 2024 at 09:51:43PM -0800, Sri kor via discuss wrote:
> >> Hi Team,
> >> 
> >> 
> >> Currently, we are using OVN in RAFT cluster mode. We have 3 NB and SB
> >> ovsdb-servers operating in RAFT cluster mode. Currently we have 500
> >> hypervisors connected to this RAFT cluster.
> >> 
> >> For our next deployment, our scale will increase to 3000 hypervisors. To
> >> accommodate this larger number of hypervisors, we are migrating to DB
> >> relay with a multigroup deployment model. This helps with OVN SB DB read
> >> transactions. But for write transactions, only the leader in the RAFT
> >> cluster can update the DB. This creates a load on the RAFT leader. Is
> >> there a way to address the load on the RAFT cluster leader?
> > 
> > We do the following:
> > * If you need TLS on the ovsdb path, separate it out to a reverse
> >  proxy that does only L4 TLS termination (e.g. traefik)
> 
> Do I understand correctly that with such TLS "offload" you can’t use RBAC for 
> hypervisors?
> 

Yes, that is the unfortunate side effect.

> > * Have nobody besides northd connect to the SB DB directly, everyone
> >  else needs to use a relay
> > * Do not run backups on the cluster leader, but on one of the current
> >  followers
> > * Increase the RAFT election timeout significantly (we have 120s in
> >  there). However, AFAIK there is a patch in 3.3 that improves this
> > * If you create metrics or similar from database content, generate
> >  these on the relays instead of the RAFT cluster
> > 
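A side note on the election timeout item above: to my knowledge,
`ovs-appctl ... cluster/change-election-timer` accepts values between 100
and 600000 ms and refuses to more than double the current value in one
call, so going from the 1000 ms default to 120s takes several calls. The
following is a minimal sketch that plans those calls. The function names
and the control socket path are assumptions for illustration; adjust them
to your deployment.

```python
# Sketch: plan the intermediate values needed to raise the RAFT election
# timer, assuming each change may at most double the current value and the
# allowed range is 100..600000 ms.
MIN_MS, MAX_MS = 100, 600000


def election_timer_steps(current_ms, target_ms):
    """Return the timer values to apply, in order (empty if no raise needed)."""
    if not (MIN_MS <= current_ms <= MAX_MS and MIN_MS <= target_ms <= MAX_MS):
        raise ValueError("timer values must be within 100..600000 ms")
    steps = []
    while current_ms < target_ms:
        # Never exceed twice the current value, and never overshoot the target.
        current_ms = min(current_ms * 2, target_ms)
        steps.append(current_ms)
    return steps


def appctl_commands(steps, ctl="/var/run/ovn/ovnsb_db.ctl", db="OVN_Southbound"):
    """Render one ovs-appctl call per step. The ctl path is an assumed default."""
    return [
        f"ovs-appctl -t {ctl} cluster/change-election-timer {db} {ms}"
        for ms in steps
    ]


if __name__ == "__main__":
    for cmd in appctl_commands(election_timer_steps(1000, 120000)):
        print(cmd)
```

The commands would be run one after another against the current leader.
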
> > Overall, when our southbound DB had issues, most of the time it was
> > some client constantly reconnecting to it and thereby always pulling a
> > full DB dump.
> > 
> >> 
> >> 
> >> As the scale increases, the number of updates coming to ovn-controller
> >> from OVN SB increases. That creates pressure on ovn-controller. Is there
> >> a way to minimize the load on ovn-controller?
> > 
> > We have not seen any issues there yet.
> > However, if you are using some Python tooling outside of OVN (e.g.
> > OpenStack), ensure that JSON parsing using a C library is available
> > in the ovs lib. This brings significant performance benefits if you
> > have a lot of updates.
> > You can check with `python3 -c "import ovs.json; print(ovs.json.PARSER)"`
> > which should return "C".
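
The same check can be wrapped in a small script, for example for use in a
deployment health check. This is only a sketch: the function names and
messages are made up for illustration, and it relies solely on the
`ovs.json.PARSER` attribute mentioned above.

```python
import importlib


def ovs_json_parser():
    """Return ovs.json.PARSER, or None if the ovs library cannot be imported."""
    try:
        ovs_json = importlib.import_module("ovs.json")
    except ImportError:
        return None
    return getattr(ovs_json, "PARSER", None)


def parser_status(parser):
    """Turn the parser name into a short human-readable status line."""
    if parser is None:
        return "unknown: ovs python library not importable"
    if parser == "C":
        return "ok: C JSON parser in use"
    return f"warning: non-C JSON parser in use ({parser})"


if __name__ == "__main__":
    print(parser_status(ovs_json_parser()))
```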
> > 
> >> 
> >> I wish there were a way for ovn-controller to subscribe only to updates
> >> specific to its hypervisor. Are there any known ovn-controller
> >> subscription methods available and in use by the OVS community?
> > 
> > Yes, they do that by default. However, we saw that this creates
> > increased load on the relays due to the additional filtering and JSON
> > serialization needed per target node. So we turned it off, accepting
> > more network bandwidth in exchange for less ovsdb load.
> > The relevant setting is `external_ids:ovn-monitor-all`.
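
For illustration, a minimal sketch of how that setting could be applied on
a hypervisor. It assumes the option lives in the local Open_vSwitch table
and that turning the per-chassis filtering off corresponds to setting it to
`true`; please verify both against the ovn-controller documentation for
your version. The helper names are hypothetical.

```python
import subprocess


def monitor_all_command(enabled):
    """Build the ovs-vsctl argv that sets external_ids:ovn-monitor-all."""
    value = "true" if enabled else "false"
    return [
        "ovs-vsctl", "set", "open_vswitch", ".",
        f"external_ids:ovn-monitor-all={value}",
    ]


def apply_monitor_all(enabled, dry_run=True):
    """Print the command (dry run) or execute it on the local hypervisor."""
    cmd = monitor_all_command(enabled)
    if dry_run:
        print(" ".join(cmd))
    else:
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Dry run only: shows what would be executed on this hypervisor.
    apply_monitor_all(True)
```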
> > 
> > Thanks
> > Felix
> > 
> >> 
> >> 
> >> How can I optimize the load on the leader node in an OVN RAFT cluster to
> >> handle increased write transactions?
> >> 
> >> 
> >> 
> >> Thanks,
> >> 
> >> Srini
> > 
> >> _______________________________________________
> >> discuss mailing list
> >> disc...@openvswitch.org
> >> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
> > 
> 
> 
> Regards,
> Vladislav Odintsov
> 