On 10/26/20 2:42 AM, Ilya Maximets wrote:
> Compaction happens at most once in 10 minutes. That is a big time
> interval for a heavily loaded ovsdb-server in cluster mode.
> In 10 minutes raft logs could grow up to tens of thousands of entries
> with tens of gigabytes in total size.
> While compaction cleans up raft log entries, the memory in many cases
> is not returned to the system, but kept in the heap of running
> ovsdb-server process, and it could stay in this condition for a really
> long time. In the end one performance spike could lead to a fast
> growth of the raft log and this memory will never (for a really long
> time) be released to the system even if the database is empty.
>
> Simple example how to reproduce with OVN sandbox:
>
> 1. make sandbox SANDBOXFLAGS='--nbdb-model=clustered --sbdb-model=clustered'
>
> 2. Run following script that creates 1 port group, adds 4000 acls and
>    removes all of that in the end:
>
>    # cat ../memory-test.sh
>    pg_name=my_port_group
>    export OVN_NB_DAEMON=$(ovn-nbctl --pidfile --detach --log-file -vsocket_util:off)
>    ovn-nbctl pg-add $pg_name
>    for i in $(seq 1 4000); do
>      echo "Iteration: $i"
>      ovn-nbctl --log acl-add $pg_name from-lport $i udp drop
>    done
>    ovn-nbctl acl-del $pg_name
>    ovn-nbctl pg-del $pg_name
>    ovs-appctl -t $(pwd)/sandbox/nb1 memory/show
>    ovn-appctl -t ovn-nbctl exit
>    ---
>
> 3. Stopping one of Northbound DB servers:
>    ovs-appctl -t $(pwd)/sandbox/nb1 exit
>
>    Make sure that ovsdb-server didn't compact the database before
>    it was stopped. Now we have a db file on disk that contains
>    4000 fairly big transactions inside.
>
> 4. Trying to start same ovsdb-server with this file.
>
>    # cd sandbox && ovsdb-server <...> nb1.db
>
>    At this point ovsdb-server reads all the transactions from db
>    file and performs all of them as fast as it can one by one.
>    When it finishes this, raft log contains 4000 entries and
>    ovsdb-server consumes (on my system) ~13GB of memory while
>    database is empty. And libc will likely never return this memory
>    back to system, or, at least, will hold it for a really long time.
>
> This patch adds a new command 'ovsdb-server/memory-trim-on-compaction'.
> It's disabled by default, but once enabled, ovsdb-server will call
> 'malloc_trim(0)' after every successful compaction to try to return
> unused heap memory back to system. This is glibc-specific, so we
> need to detect function availability at build time.
> Disabled by default since it adds from 1% to 30% (depending on the
> current state) to the snapshot creation time and, also, next memory
> allocations will likely require requests to kernel and that might be
> slower. Could be enabled by default later if considered broadly
> beneficial.
>
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1888829
> Signed-off-by: Ilya Maximets <i.maxim...@ovn.org>
> ---
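
For anyone following along, the behavior described above could look roughly like the sketch below. It is not the code from the patch: the names trim_memory_on_compaction, set_memory_trim_on_compaction(), memory_trim_supported() and trim_after_compaction() are hypothetical, and the __GLIBC__ check only stands in for the build-time detection the commit message mentions. The one real library call is glibc's malloc_trim(0).

```c
#include <stdbool.h>
#include <stdlib.h>   /* On glibc this also defines __GLIBC__. */

#ifdef __GLIBC__
#include <malloc.h>   /* Declares malloc_trim(). */
#endif

/* Hypothetical runtime switch, off by default as in the description. */
static bool trim_memory_on_compaction = false;

/* Hypothetical handler for an on/off style command. */
static void
set_memory_trim_on_compaction(bool enable)
{
    trim_memory_on_compaction = enable;
}

/* Reports whether this build can trim the heap at all. */
static bool
memory_trim_supported(void)
{
#ifdef __GLIBC__
    return true;
#else
    return false;
#endif
}

/* Meant to be called after a successful compaction.  Returns true if a
 * trim was attempted, false if the feature is disabled or unavailable. */
static bool
trim_after_compaction(void)
{
    if (!trim_memory_on_compaction || !memory_trim_supported()) {
        return false;
    }
#ifdef __GLIBC__
    /* Ask glibc to return free heap memory to the system.  The return
     * value (1 if something was released, 0 otherwise) is ignored here. */
    malloc_trim(0);
#endif
    return true;
}
```

Keeping the switch off by default matches the trade-off stated in the commit message: trimming adds time to snapshot creation, and later allocations may have to go back to the kernel for memory.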

Looks good to me, thanks!

Acked-by: Dumitru Ceara <dce...@redhat.com>
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev