On 10/28/20 11:57 AM, Dumitru Ceara wrote:
> On 10/26/20 2:42 AM, Ilya Maximets wrote:
>> Compaction happens at most once in 10 minutes.  That is a big time
>> interval for a heavily loaded ovsdb-server in cluster mode.
>> In 10 minutes raft logs could grow up to tens of thousands of entries
>> with tens of gigabytes in total size.
>> While compaction cleans up raft log entries, the memory in many cases
>> is not returned to the system, but kept in the heap of the running
>> ovsdb-server process, and it could stay in this condition for a really
>> long time.  In the end, one performance spike could lead to a fast
>> growth of the raft log, and this memory will not be released to the
>> system for a really long time, even if the database is empty.
>>
>> A simple way to reproduce this with the OVN sandbox:
>>
>> 1. make sandbox SANDBOXFLAGS='--nbdb-model=clustered --sbdb-model=clustered'
>>
>> 2. Run the following script, which creates 1 port group, adds 4000 ACLs
>>    and removes all of that in the end:
>>
>>    # cat ../memory-test.sh
>>    pg_name=my_port_group
>>    export OVN_NB_DAEMON=$(ovn-nbctl --pidfile --detach --log-file -vsocket_util:off)
>>    ovn-nbctl pg-add $pg_name
>>    for i in $(seq 1 4000); do
>>      echo "Iteration: $i"
>>      ovn-nbctl --log acl-add $pg_name from-lport $i udp drop
>>    done
>>    ovn-nbctl acl-del $pg_name
>>    ovn-nbctl pg-del $pg_name
>>    ovs-appctl -t $(pwd)/sandbox/nb1 memory/show
>>    ovn-appctl -t ovn-nbctl exit
>>    ---
>>
>> 3. Stop one of the Northbound DB servers:
>>    ovs-appctl -t $(pwd)/sandbox/nb1 exit
>>
>>    Make sure that ovsdb-server didn't compact the database before
>>    it was stopped.  Now we have a db file on disk that contains
>>    4000 fairly big transactions inside.
>>
>> 4. Try to start the same ovsdb-server with this file:
>>
>>    # cd sandbox && ovsdb-server <...> nb1.db
>>
>>    At this point ovsdb-server reads all the transactions from the db
>>    file and performs all of them as fast as it can, one by one.
>>    When it finishes, the raft log contains 4000 entries and
>>    ovsdb-server consumes (on my system) ~13GB of memory while the
>>    database is empty.  And libc will likely never return this memory
>>    to the system, or, at least, will hold it for a really long time.
>>
>> This patch adds a new command 'ovsdb-server/memory-trim-on-compaction'.
>> It's disabled by default, but once enabled, ovsdb-server will call
>> 'malloc_trim(0)' after every successful compaction to try to return
>> unused heap memory back to the system.  This is glibc-specific, so we
>> need to detect function availability at build time.
>> It is disabled by default since it adds from 1% to 30% (depending on
>> the current state) to the snapshot creation time and, also, subsequent
>> memory allocations will likely require requests to the kernel, which
>> might be slower.  It could be enabled by default later if considered
>> broadly beneficial.
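
A minimal sketch of the behaviour described above, not the code from the
patch: a runtime toggle that is off by default and, once enabled, calls
glibc's malloc_trim(0) after a successful compaction.  The names
TRIM_SUPPORTED, set_trim_on_compaction() and maybe_trim_after_compaction()
are hypothetical, and the __GLIBC__ check only stands in for the build-time
detection that the patch performs.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for build-time detection: always defined, to 1 or 0. */
#if defined(__GLIBC__)
#include <malloc.h>             /* malloc_trim() is a glibc extension. */
#define TRIM_SUPPORTED 1
#else
#define TRIM_SUPPORTED 0
#endif

/* Hypothetical state that a command such as
 * 'ovsdb-server/memory-trim-on-compaction' would flip.  Off by default. */
static bool trim_on_compaction = false;

static void
set_trim_on_compaction(bool enable)
{
    trim_on_compaction = enable;
}

/* Hypothetical hook, meant to be called after a successful compaction.
 * Returns true if a trim was attempted. */
static bool
maybe_trim_after_compaction(void)
{
    if (!trim_on_compaction) {
        return false;
    }
#if TRIM_SUPPORTED
    /* Ask glibc to return free heap memory to the system. */
    malloc_trim(0);
    return true;
#else
    return false;
#endif
}
```

With the toggle left at its default, maybe_trim_after_compaction() does
nothing, which matches the "disabled by default" choice described above.
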
>>
>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1888829
>> Signed-off-by: Ilya Maximets <i.maxim...@ovn.org>
>> ---
> 
> Looks good to me, thanks!
> 
> Acked-by: Dumitru Ceara <dce...@redhat.com>
> 


Thanks!  There was a small issue with the ifdef: the macro is always
defined, just with different values.  I fixed it with s/ifdef/if/ and
s/ifndef /if !/.  With that, applied to master.
Will backport down to 2.13 with dependencies as soon as TravisCI finishes the 
check.
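
A minimal sketch of the ifdef issue mentioned above, assuming the build
system always defines the macro, to 1 when the function is available and
to 0 when it is not (Autoconf's AC_CHECK_DECLS behaves this way; whether
the patch relies on it is an assumption here).  HAVE_DECL_EXAMPLE_FN and
both helper functions are hypothetical names.

```c
/* Simulate a build where the function was NOT found: the macro is still
 * defined, only its value is 0. */
#define HAVE_DECL_EXAMPLE_FN 0

/* Wrong: #ifdef only asks whether the macro is defined, which is true
 * even when its value is 0. */
static int
enabled_by_ifdef(void)
{
#ifdef HAVE_DECL_EXAMPLE_FN
    return 1;
#else
    return 0;
#endif
}

/* Right: #if evaluates the macro's value. */
static int
enabled_by_if(void)
{
#if HAVE_DECL_EXAMPLE_FN
    return 1;
#else
    return 0;
#endif
}
```

Here enabled_by_ifdef() returns 1 although the function is unavailable,
while enabled_by_if() returns 0.
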

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev