Hi Srini,

I can share what works for us with ~1k hypervisors:

On Tue, Mar 05, 2024 at 09:51:43PM -0800, Sri kor via discuss wrote:
> Hi Team,
> 
> 
> Currently , we are using OVN in RAFT cluster mode. We have 3 NB and SB
> ovsdb-servers operating in RAFT cluster mode. Currently we have 500
> hypervisors connected to this RAFT cluster.
> 
> For our next deployment, our scale would increase to 3000 hypervisors. To
> accommodate this scaled hypervisors, we are migrating to DB relay with
> multigroup deployment model. This increase helps with OVN SB DB read
> transactions. But for write transactions, only the leader in the RAFT
> cluster can update the DB. This creates a load on the leader of RAFT. Is
> there a way to address the load on the RAFT cluster leader?

We do the following:
* If you need TLS on the ovsdb path, move it out to a reverse proxy
  that does only L4 TLS termination (e.g. traefik)
* Have nobody besides northd connect to the SB DB directly; everyone
  else needs to use a relay
* Do not run backups on the cluster leader, but on one of the current
  followers
* Increase the raft election timeout significantly (we use 120s).
  As far as I know there is a patch in 3.3 that improves this
* If you generate metrics from database content, generate them on
  the relays instead of the raft cluster
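
On the election timeout item: the timer is normally changed at runtime
with `ovs-appctl ... cluster/change-election-timer`, run on the leader,
with the value given in milliseconds. As far as I know, ovsdb-server
refuses to raise the timer to more than twice its current value in one
call, so reaching 120s takes several steps. A minimal sketch that only
plans and prints the commands; the control socket path and the starting
value of 1000 ms are assumptions, so check them against your install:

```python
# Sketch: plan stepwise increases of the RAFT election timer.
# Assumption: each call may at most double the current value (in ms).
# Lowering the timer is not handled here; an empty plan is returned.


def election_timer_steps(current_ms, target_ms):
    """Return the values to set, in order, to get from current_ms to target_ms."""
    if current_ms <= 0 or target_ms <= 0:
        raise ValueError("timer values must be positive")
    steps = []
    while current_ms < target_ms:
        current_ms = min(current_ms * 2, target_ms)
        steps.append(current_ms)
    return steps


def plan_commands(current_ms, target_ms,
                  ctl="/var/run/ovn/ovnsb_db.ctl",  # assumed socket path
                  db="OVN_Southbound"):
    """Build one ovs-appctl command line per step, as plain strings."""
    return [
        f"ovs-appctl -t {ctl} cluster/change-election-timer {db} {ms}"
        for ms in election_timer_steps(current_ms, target_ms)
    ]


if __name__ == "__main__":
    for cmd in plan_commands(1000, 120000):
        print(cmd)
```

The script does not talk to the database; it prints the commands so
they can be reviewed and run one at a time on the current leader.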

Overall, when our southbound DB had issues, most of the time the cause
was some client constantly reconnecting to it and thereby pulling a
full DB dump each time.

> 
> 
> As the scale increases, number updates coming to the ovn-controller from
> OVN SB increases. that creates pressure on ovn-controller. Is there a way
> to minimize the load on ovn-controller?

I have not seen any issues there yet.
However, if you are using Python tooling outside of OVN (e.g.
OpenStack), ensure that the ovs lib has its C JSON parser
available. This brings significant performance benefits if you have
a lot of updates.
You can check with `python3 -c "import ovs.json; print(ovs.json.PARSER)"`
which should print "C".
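
The same check can be wrapped in a small function, for example to run
it inside the virtualenv of each service that uses the ovs bindings. A
rough sketch; the function name is made up for illustration, and it
returns None when the ovs package cannot be imported at all:

```python
# Sketch: report which JSON parser the ovs Python bindings use.
import importlib


def ovs_json_parser():
    """Return ovs.json.PARSER, or None if the ovs package is not importable."""
    try:
        ovs_json = importlib.import_module("ovs.json")
    except ImportError:
        return None
    return getattr(ovs_json, "PARSER", None)


if __name__ == "__main__":
    parser = ovs_json_parser()
    if parser is None:
        print("ovs Python bindings not found in this interpreter")
    elif parser == "C":
        print("C JSON parser in use")
    else:
        print(f"fallback parser in use: {parser}")
```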

> 
> I wish there is a way for ovn-controller to subscribe to updates specific
> to this hypervisor. Are there any known ovn-contrller subscription methods
> available and being used OVS community?

Yes, they do that by default. However, we saw that this creates
increased load on the relays because of the additional filtering and
JSON serialization needed per target node. So we turned it off and
thereby accept more network bandwidth in exchange for less ovsdb load.
The relevant setting is `external_ids:ovn-monitor-all`.
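
The key lives in the local Open_vSwitch table on each hypervisor, and
setting it to true makes ovn-controller monitor all SB records instead
of a per-chassis subset. A minimal sketch that builds the `ovs-vsctl`
call and only executes it when asked to; the helper names are
hypothetical, and it assumes `ovs-vsctl` is on the PATH of the
hypervisor:

```python
# Sketch: toggle conditional monitoring for the local ovn-controller.
import shlex
import subprocess


def monitor_all_cmd(enabled=True):
    """Return the ovs-vsctl argv that sets external_ids:ovn-monitor-all."""
    value = "true" if enabled else "false"
    return ["ovs-vsctl", "set", "Open_vSwitch", ".",
            f"external_ids:ovn-monitor-all={value}"]


def apply_monitor_all(enabled=True, dry_run=True):
    """Print the command; run it only when dry_run is False."""
    cmd = monitor_all_cmd(enabled)
    print(shlex.join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    apply_monitor_all(enabled=True, dry_run=True)
```

With dry_run left at True nothing is changed, which makes it easy to
review the command before rolling it out to many nodes.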

Thanks
Felix

> 
> 
> How can I optimize the load on the leader node in an OVN RAFT cluster to
> handle increased write transactions?
> 
> 
> 
> Thanks,
> 
> Srini

> _______________________________________________
> discuss mailing list
> disc...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss

