On 8/3/22 16:43, Ilya Maximets wrote:
> Some CMSes, like ovn-kubernetes, tend to create load balancers attached
> to a big number of logical switches or routers.  For each of these
> load balancers northd creates a record in Load_Balancer table of the
> Southbound database with all the logical datapaths (switches, routers)
> listed in the 'datapaths' column.  All these logical datapaths are
> references to datapath bindings.  With large number of load balancers
> like these applied to the same set of load balancers, the size of
> the Southbound database can grow significantly and these references
> can take up to 90% of all the traffic between Sb DB and ovn-controllers.
>
> For example, while creating 3 load balancers (1 for tcp, udp and sctp)
> in a 250-node cluster in ovn-heater cluster-density test, the database
> update sent to every ovn-controller is about 40 KB in size, out of
> which 36 KB are just UUIDs of 250 logical datapaths repeated 3 times.
>
> Introducing a new column 'datapath_group' in a Load_Balancer table,
> so we can create a group once and just refer to it from each load
> balancer.  This saves a lot of CPU time, memory and network bandwidth.
> Re-using already existing Logical_DP_Group table to store these groups.
>
> In 250 node cluster-density test with ovn-heater that creates 30K load
> balancers applied to all 250 logical switches the following improvement
> was observed:
>
>  - Southbound database size decreased from 310 MB to 118 MB.
>  - CPU time on Southbound ovsdb-server process decreased by a
>    factor of 7, from ~35 minutes per server to ~5.
>  - Memory consumption on Southbound ovsdb-server process decreased
>    from 12 GB (peaked at ~14) RSS down to ~1 GB (peaked at ~2).
>  - Memory consumption on ovn-controller processes decreased
>    by 400 MB on each node, from 1300 MB to 900 MB.  Similar decrease
>    observed for ovn-northd as well.
>
> We're adding some extra work to ovn-northd process with this change
> to find/create datapath groups.  CPU time in the test above increased
> by ~10%, but overall round trip time for changes in OVN to be
> propagated to OVN controllers is still noticeably lower due to
> improvements in other components like Sb DB.
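
As a rough sanity check of the "36 KB of UUIDs" figure quoted above, here is
a minimal sketch that estimates the JSON size of the datapath references
with and without a shared group.  It assumes the OVSDB wire encoding from
RFC 7047, where a UUID is sent as ["uuid", "<string>"] and a set as
["set", [...]], and it models only the reference columns, not the rest of
the update.  The helper names (uuid_ref, encoded_size) and the random UUIDs
are hypothetical and are not taken from the patch.

```python
import json
import uuid


def uuid_ref(u):
    # RFC 7047 encodes a UUID reference as a 2-element JSON array.
    return ["uuid", str(u)]


def encoded_size(value):
    # Size of the compact JSON form, without optional whitespace.
    return len(json.dumps(value, separators=(",", ":")))


N_DATAPATHS = 250  # logical datapaths in the example from the commit message
N_LBS = 3          # one load balancer each for tcp, udp and sctp

# Random stand-ins for the real datapath binding UUIDs.
datapaths = [uuid.uuid4() for _ in range(N_DATAPATHS)]
dp_set = ["set", [uuid_ref(d) for d in datapaths]]

# Without groups: every load balancer repeats the full set of datapaths.
per_lb = encoded_size(dp_set)
before = N_LBS * per_lb

# With groups (assumed model): the set is stored once in a group row and
# each load balancer carries a single reference to that group.
group_ref = uuid_ref(uuid.uuid4())
after = per_lb + N_LBS * encoded_size(group_ref)

print(f"without groups: {before} bytes, with one shared group: {after} bytes")
```

Under these assumptions the repeated references come to about 36 KB for
3 load balancers, in line with the number in the commit message, and to
roughly a third of that once the set is stored only once.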

Actually, I was using the data from the preliminary run for this commit
message and also looked in the wrong place while checking the northd
performance numbers :/ .  A re-run with the version of the code in this
patch shows a 12% performance improvement on ovn-northd, judging by the
actual length of poll intervals.

So, the northd-related paragraph above should be thrown away and replaced
with one more bullet in the observed improvements:

  - Poll intervals on ovn-northd reduced by 12%.

<snip>

Best regards, Ilya Maximets.