On 1/15/21 6:05 PM, Dumitru Ceara wrote:
> On 1/14/21 2:11 AM, Ilya Maximets wrote:
>> Currently, ovsdb-server stores the complete value of a column in the
>> database file and in the raft log whenever that column changes.  This
>> means that a transaction that adds, for example, one new acl to a port
>> group creates a log entry with the UUIDs of all existing acls plus the
>> new one.  The same applies to ports in logical switches and routers and
>> to many other set columns in the Northbound DB.
>>
>> There could be thousands of acls in one port group or thousands of ports
>> in a single logical switch.  And the typical use case is to add one new
>> entry when starting a new service/VM/container or adding a new node to a
>> kubernetes or OpenStack cluster.  This generates a huge amount of traffic
>> within the ovsdb raft cluster, increases overall memory consumption and
>> hurts performance, since all these UUIDs are parsed from and formatted to
>> json several times and stored on disk.  The more values a set holds, the
>> more space a single log entry occupies and the longer ovsdb-server
>> cluster members take to process it.
>>
>> Simple test:
>>
>> 1. Start OVN sandbox with clustered DBs:
>>     # make sandbox SANDBOXFLAGS='--nbdb-model=clustered --sbdb-model=clustered'
>>
>> 2. Run a script that creates one port group and adds 4000 acls into it:
>>     # cat ../memory-test.sh
>>     pg_name=my_port_group
>>     export OVN_NB_DAEMON=$(ovn-nbctl --pidfile --detach --log-file -vsocket_util:off)
>>     ovn-nbctl pg-add $pg_name
>>     for i in $(seq 1 4000); do
>>       echo "Iteration: $i"
>>       ovn-nbctl --log acl-add $pg_name from-lport $i udp drop
>>     done
>>     ovn-nbctl acl-del $pg_name
>>     ovn-nbctl pg-del $pg_name
>>     ovs-appctl -t $(pwd)/sandbox/nb1 memory/show
>>     ovn-appctl -t ovn-nbctl exit
>>     ---
>>
>> 4. Check the current memory consumption of ovsdb-server processes and
>>     space occupied by database files:
>>     # ls sandbox/[ns]b*.db -alh
>>     # ps -eo vsz,rss,comm,cmd | egrep '=[ns]b[123].pid'
>>
>> Test results with current ovsdb log format:
>>
>>     On-disk Nb DB size     :  ~369 MB
>>     RSS of Nb ovsdb-servers:  ~2.7 GB
>>     Time to finish the test:  ~2m
>>
>> In order to mitigate memory consumption issues and reduce computational
>> load on ovsdb-servers, let's store the diff between old and new values
>> instead.  This makes the size of each log entry that adds a single acl
>> to a port group (or a port to a logical switch, or anything else like
>> that) very small and independent of the number of already existing acls
>> (ports, etc.).
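
For a rough sense of scale, the sketch below counts how many UUIDs end up
in the log when N entries are added to a set one at a time, first with
full-value log entries and then with diffs.  It is illustrative arithmetic
only: the function name is made up for this example, and JSON framing and
other per-transaction overhead are ignored.

```python
def uuids_logged(n_adds: int, use_diff: bool) -> int:
    """Count UUIDs written to the log while adding n_adds set elements,
    one element per transaction (hypothetical helper, not ovsdb code)."""
    if use_diff:
        # Each log entry carries only the single UUID that was added.
        return n_adds
    # Each log entry re-lists the whole set: 1 + 2 + ... + n_adds UUIDs.
    return n_adds * (n_adds + 1) // 2


full_value = uuids_logged(4000, use_diff=False)
diffs = uuids_logged(4000, use_diff=True)
print(f"full values: {full_value} UUIDs, diffs: {diffs} UUIDs")
```

With full values the total grows quadratically in the number of additions;
with diffs it grows linearly.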
>>
>> This adds a new marker '_is_diff' to a file transaction to specify that
>> the transaction contains diffs instead of replacements for the existing
>> data.
>>
>> One side effect is that this change will actually increase the size of
>> a file transaction that removes more than half of the entries from a
>> set, because the diff will be larger than the resulting new value.
>> However, such operations are rare.
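
The trade-off can be sketched with plain Python sets.  This assumes the
diff of a set column is the symmetric difference of the old and new
values, which the message above does not spell out; the function names
and the "acl-N" strings are placeholders, not the ovsdb implementation
or its on-disk encoding.

```python
def column_diff(old: set, new: set) -> set:
    """Elements present in exactly one of old/new (assumed diff form)."""
    return old ^ new


def apply_diff(old: set, diff: set) -> set:
    """Rebuild the new value by toggling the diffed elements in old."""
    return old ^ diff


old = {f"acl-{i}" for i in range(10)}

# Common case: add one element, so the diff holds one element.
added = old | {"acl-10"}
d_add = column_diff(old, added)

# Rare case: remove 7 of 10 elements, so the diff (7 elements) is
# larger than the new value (3 elements).
trimmed = {f"acl-{i}" for i in range(3)}
d_rm = column_diff(old, trimmed)

print(len(d_add), len(added))
print(len(d_rm), len(trimmed))
```

Under this assumption the diff is smaller than the new value for an
addition, and larger only once more than half of the set is removed.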
>>
>> Test results with change applied:
>>
>>     On-disk Nb DB size     :  ~2.7 MB  ---> reduced by 99%
>>     RSS of Nb ovsdb-servers:  ~580 MB  ---> reduced by 78%
>>     Time to finish the test:  ~1m27s   ---> reduced by 27%
>>
>> After this change a new ovsdb-server is still able to read old databases,
>> but an old ovsdb-server will not be able to read new ones.
>> Since new servers can join an ovsdb cluster dynamically, it's hard to
>> implement any runtime mechanism to handle cases where different
>> versions of ovsdb-server join the cluster.  However, we still need to
>> handle cluster upgrades.  For this case a special command line
>> argument was added to disable the new functionality.  The documentation
>> was updated with the recommended way to upgrade an ovsdb cluster.
>>
>> Acked-by: Dumitru Ceara <dce...@redhat.com>
>> Signed-off-by: Ilya Maximets <i.maxim...@ovn.org>
>> ---
> 
> This still looks good to me, thanks!

Thanks!  Applied to master.

Best regards, Ilya Maximets.

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
