On 1/15/21 6:05 PM, Dumitru Ceara wrote:
> On 1/14/21 2:11 AM, Ilya Maximets wrote:
>> Currently, ovsdb-server stores complete value for the column in a database
>> file and in a raft log in case this column changed.  This means that
>> transaction that adds, for example, one new acl to a port group creates
>> a log entry with all UUIDs of all existing acls + one new.  Same for
>> ports in logical switches and routers and more other columns with sets
>> in Northbound DB.
>>
>> There could be thousands of acls in one port group or thousands of ports
>> in a single logical switch.  And the typical use case is to add one new
>> if we're starting a new service/VM/container or adding one new node in a
>> kubernetes or OpenStack cluster.  This generates huge amount of traffic
>> within ovsdb raft cluster, grows overall memory consumption and hurts
>> performance since all these UUIDs are parsed and formatted to/from json
>> several times and stored on disks.  And more values we have in a set -
>> more space a single log entry will occupy and more time it will take to
>> process by ovsdb-server cluster members.
>>
>> Simple test:
>>
>>   1. Start OVN sandbox with clustered DBs:
>>      # make sandbox SANDBOXFLAGS='--nbdb-model=clustered --sbdb-model=clustered'
>>
>>   2. Run a script that creates one port group and adds 4000 acls into it:
>>      # cat ../memory-test.sh
>>      pg_name=my_port_group
>>      export OVN_NB_DAEMON=$(ovn-nbctl --pidfile --detach --log-file -vsocket_util:off)
>>      ovn-nbctl pg-add $pg_name
>>      for i in $(seq 1 4000); do
>>        echo "Iteration: $i"
>>        ovn-nbctl --log acl-add $pg_name from-lport $i udp drop
>>      done
>>      ovn-nbctl acl-del $pg_name
>>      ovn-nbctl pg-del $pg_name
>>      ovs-appctl -t $(pwd)/sandbox/nb1 memory/show
>>      ovn-appctl -t ovn-nbctl exit
>>      ---
>>
>>   4. Check the current memory consumption of ovsdb-server processes and
>>      space occupied by database files:
>>      # ls sandbox/[ns]b*.db -alh
>>      # ps -eo vsz,rss,comm,cmd | egrep '=[ns]b[123].pid'
>>
>> Test results with current ovsdb log format:
>>
>>   On-disk Nb DB size     :  ~369 MB
>>   RSS of Nb ovsdb-servers:  ~2.7 GB
>>   Time to finish the test:  ~2m
>>
>> In order to mitigate memory consumption issues and reduce computational
>> load on ovsdb-servers let's store diff between old and new values
>> instead.  This will make size of each log entry that adds single acl to
>> port group (or port to logical switch or anything else like that) very
>> small and independent from the number of already existing acls (ports,
>> etc.).
>>
>> Added a new marker '_is_diff' into a file transaction to specify that
>> this transaction contains diffs instead of replacements for the existing
>> data.
>>
>> One side effect is that this change will actually increase the size of
>> file transaction that removes more than a half of entries from the set,
>> because diff will be larger than the resulted new value.  However, such
>> operations are rare.
>>
>> Test results with change applied:
>>
>>   On-disk Nb DB size     :  ~2.7 MB  ---> reduced by 99%
>>   RSS of Nb ovsdb-servers:  ~580 MB  ---> reduced by 78%
>>   Time to finish the test:  ~1m27s   ---> reduced by 27%
>>
>> After this change new ovsdb-server is still able to read old databases,
>> but old ovsdb-server will not be able to read new ones.
>> Since new servers could join ovsdb cluster dynamically it's hard to
>> implement any runtime mechanism to handle cases where different
>> versions of ovsdb-server joins the cluster.  However we still need to
>> handle cluster upgrades.  For this case added special command line
>> argument to disable new functionality.  Documentation updated with the
>> recommended way to upgrade the ovsdb cluster.
>>
>> Acked-by: Dumitru Ceara <dce...@redhat.com>
>> Signed-off-by: Ilya Maximets <i.maxim...@ovn.org>
>> ---
>
> This still looks good to me, thanks!
Thanks!  Applied to master.

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
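
A note for readers of the archive: the size numbers in the commit message
follow from the log entry no longer growing with the set.  Below is a minimal
Python sketch of that effect.  It is not the ovsdb-server implementation, and
it assumes the diff of a set column behaves like a symmetric difference (the
elements present in exactly one of the old and new values).  The names
make_diff, apply_diff and entry_size are hypothetical, the UUIDs are random
stand-ins, and JSON length is only a rough proxy for the size of an entry.

```python
import json
import uuid


def make_diff(old, new):
    """Assumed diff of a set column: elements present in exactly one value."""
    return old ^ new


def apply_diff(old, diff):
    """Applying the same symmetric difference to the old value gives the new one."""
    return old ^ diff


def entry_size(values):
    """Rough entry size: length of the JSON-encoded list of UUID strings."""
    return len(json.dumps(sorted(str(v) for v in values)))


# A port group that already references 4000 acls (random stand-in UUIDs).
old_acls = {uuid.uuid4() for _ in range(4000)}

# Typical case: one acl is added, so the diff holds a single UUID.
new_acls = old_acls | {uuid.uuid4()}
diff = make_diff(old_acls, new_acls)
assert apply_diff(old_acls, diff) == new_acls
print("add one:     full =", entry_size(new_acls), "diff =", entry_size(diff))

# Rare case: more than half of the entries are removed, so the diff
# (the removed elements) is larger than the resulting value.
kept = set(list(old_acls)[:1000])
diff = make_diff(old_acls, kept)
assert apply_diff(old_acls, diff) == kept
print("remove 3000: full =", entry_size(kept), "diff =", entry_size(diff))
```

Under that assumption the entry for adding one acl stays the same size however
many acls the port group already holds, while removing 3000 of 4000 produces a
diff roughly three times the size of the remaining value, which is the side
effect the commit message mentions.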