On 10/26/20 2:42 AM, Ilya Maximets wrote:
> Compaction happens at most once in 10 minutes.  That is a big time
> interval for a heavily loaded ovsdb-server in cluster mode.
> In 10 minutes raft logs could grow up to tens of thousands of entries
> with tens of gigabytes in total size.
> While compaction cleans up raft log entries, the memory in many cases
> is not returned to the system, but kept in the heap of running
> ovsdb-server process, and it could stay in this condition for a really
> long time.  In the end one performance spike could lead to a fast
> growth of the raft log and this memory will never (for a really long
> time) be released to the system even if the database is empty.
> 
> A simple way to reproduce this with the OVN sandbox:
> 
> 1. make sandbox SANDBOXFLAGS='--nbdb-model=clustered --sbdb-model=clustered'
> 
> 2. Run the following script, which creates 1 port group, adds 4000 ACLs
>    and removes all of them at the end:
> 
>    # cat ../memory-test.sh
>    pg_name=my_port_group
>    export OVN_NB_DAEMON=$(ovn-nbctl --pidfile --detach --log-file -vsocket_util:off)
>    ovn-nbctl pg-add $pg_name
>    for i in $(seq 1 4000); do
>      echo "Iteration: $i"
>      ovn-nbctl --log acl-add $pg_name from-lport $i udp drop
>    done
>    ovn-nbctl acl-del $pg_name
>    ovn-nbctl pg-del $pg_name
>    ovs-appctl -t $(pwd)/sandbox/nb1 memory/show
>    ovn-appctl -t ovn-nbctl exit
>    ---
> 
> 3. Stop one of the Northbound DB servers:
>    ovs-appctl -t $(pwd)/sandbox/nb1 exit
> 
>    Make sure that ovsdb-server didn't compact the database before
>    it was stopped.  Now we have a db file on disk that contains
>    4000 fairly big transactions inside.
> 
> 4. Start the same ovsdb-server with this file:
> 
>    # cd sandbox && ovsdb-server <...> nb1.db
> 
>    At this point ovsdb-server reads all the transactions from the db
>    file and executes them one by one as fast as it can.
>    When it finishes, the raft log contains 4000 entries and
>    ovsdb-server consumes (on my system) ~13GB of memory while the
>    database is empty.  libc will likely never return this memory
>    to the system or, at least, will hold it for a really long time.
> 
> This patch adds a new command 'ovsdb-server/memory-trim-on-compaction'.
> It's disabled by default, but once enabled, ovsdb-server will call
> 'malloc_trim(0)' after every successful compaction to try to return
> unused heap memory to the system.  This is glibc-specific, so we
> need to detect function availability at build time.
> It is disabled by default because it adds from 1% to 30% (depending on
> the current state) to the snapshot creation time and, also, subsequent
> memory allocations will likely require requests to the kernel, which
> might be slower.  It could be enabled by default later if considered
> broadly beneficial.
> 
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1888829
> Signed-off-by: Ilya Maximets <i.maxim...@ovn.org>
> ---
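
For readers following along, the trim step described in the commit message
can be sketched roughly as below.  This is only an illustration under
stated assumptions, not the code from the patch: 'trim_on_compaction',
'simulate_log_growth()' and 'trim_heap_after_compaction()' are hypothetical
names, and the '__GLIBC__' check merely stands in for the build-time
detection the patch mentions.  Only the 'malloc_trim(0)' call itself comes
from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: the glibc version macro is used here as a stand-in for a
 * configure-time check of malloc_trim() availability. */
#ifdef __GLIBC__
#include <malloc.h>
#define HAVE_MALLOC_TRIM 1
#endif

/* Hypothetical runtime switch; off by default, like the new command. */
static bool trim_on_compaction = false;

/* Hypothetical stand-in for raft log growth followed by compaction:
 * allocates 'n' small chunks and then frees all of them again.
 * Returns the number of chunks that were successfully allocated. */
static size_t
simulate_log_growth(size_t n, size_t chunk_size)
{
    void **chunks = calloc(n, sizeof *chunks);
    size_t allocated = 0;

    if (!chunks) {
        return 0;
    }
    for (size_t i = 0; i < n; i++) {
        chunks[i] = malloc(chunk_size);
        if (!chunks[i]) {
            break;
        }
        allocated++;
    }
    for (size_t i = 0; i < allocated; i++) {
        free(chunks[i]);
    }
    free(chunks);
    return allocated;
}

/* Asks glibc to hand unused heap memory back to the system, but only if
 * the switch is on and malloc_trim() is available.  Returns 1 if some
 * memory was released and 0 otherwise. */
static int
trim_heap_after_compaction(void)
{
    if (!trim_on_compaction) {
        return 0;
    }
#ifdef HAVE_MALLOC_TRIM
    return malloc_trim(0);
#else
    return 0;
#endif
}
```

Calling 'simulate_log_growth()' and then 'trim_heap_after_compaction()'
with the switch set mimics the "compact, then trim" order.  Whether any
memory is actually returned depends on the allocator state at that moment.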

Looks good to me, thanks!

Acked-by: Dumitru Ceara <dce...@redhat.com>

