On Sun, Mar 6, 2016 at 11:02 PM, Lei Huang <huang.f....@gmail.com> wrote:
>
> Hi,
>
>
> During a scalability test, we found that the OVN northbound ovsdb-server's
> memory usage becomes very high while creating and binding ports. The test
> steps are:
>
> 1. Create 1000 sandboxes.
>
> 2. Create 5 lswitches and 200 lports on each lswitch, in the northbound
> database.
>
> 3. Bind each lport created in step 2 to a random sandbox, then use
> 'ovn-nbctl wait-until Logical_Port <port-name> up=true' to wait until the
> port is 'up' in the northbound database.
>
> 4. Go back to step 2 (a rough sketch of this loop is below).
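>
> A rough sketch of one iteration of steps 2-4 (the lswitch/lport names, the
> sandbox-list file, and the --db paths here are illustrative; binding is done
> the usual way by setting external_ids:iface-id on an interface inside the
> chosen sandbox):
>
>     for sw in $(seq 1 5); do
>         ovn-nbctl lswitch-add sw-$sw
>         for p in $(seq 1 200); do
>             lport=lp-$sw-$p
>             ovn-nbctl lport-add sw-$sw $lport
>             sandbox=$(shuf -n 1 sandbox-list)    # pick a random sandbox
>             ovs-vsctl --db=unix:$sandbox/db.sock add-port br-int $lport \
>                 -- set Interface $lport external_ids:iface-id=$lport
>             ovn-nbctl wait-until Logical_Port $lport up=true
>         done
>     done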
>
>
> The VmRSS memory usage (in kB) is as below:
>
>
> lport number  ovn-northd   nb ovsdb-server   sb ovsdb-server
>
>     0               1292              1124              7104
>
>  5000              21704           6476420             17424
>
> 10000              41680          25446148             31808
>
> 15000              61600          56893928            133864
>
> 20000              81844         100955012            176868
>
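> (VmRSS above is the resident set size in kB as reported in
> /proc/<pid>/status; one way to sample it, assuming the pidfiles written by
> each daemon, is e.g.:
>
>     grep VmRSS /proc/$(cat ovsdb-server-nb.pid)/status
>
> and similarly for ovn-northd and the southbound ovsdb-server.)
>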
>
> The memory log (nb ovsdb-server) is as below:
>
> 2016-03-02T09:48:32.024Z|00003|memory|INFO|1124 kB peak resident set size
> after 10.0 seconds
>
> 2016-03-02T16:24:46.430Z|00005|memory|INFO|peak resident set size grew 150%
> in last 23774.4 seconds, from 1124 kB to 2808 kB
>
> 2016-03-02T16:25:06.432Z|00007|memory|INFO|peak resident set size grew 130%
> in last 20.0 seconds, from 2808 kB to 6448 kB
>
> 2016-03-02T16:25:16.433Z|00009|memory|INFO|peak resident set size grew 54%
> in last 10.0 seconds, from 6448 kB to 9948 kB
>
> 2016-03-02T16:25:26.434Z|00011|memory|INFO|peak resident set size grew 52%
> in last 10.0 seconds, from 9948 kB to 15072 kB
>
> 2016-03-02T16:25:36.436Z|00013|memory|INFO|peak resident set size grew 178%
> in last 10.0 seconds, from 15072 kB to 41956 kB
>
> 2016-03-02T16:26:36.439Z|00015|memory|INFO|peak resident set size grew 140%
> in last 60.0 seconds, from 41956 kB to 100696 kB
>
> 2016-03-02T16:27:36.444Z|00017|memory|INFO|peak resident set size grew 78%
> in last 60.0 seconds, from 100696 kB to 179108 kB
>
> 2016-03-02T16:28:46.444Z|00019|memory|INFO|peak resident set size grew 55%
> in last 70.0 seconds, from 179108 kB to 277412 kB
>
> 2016-03-02T16:39:26.953Z|00021|memory|INFO|peak resident set size grew 50%
> in last 640.5 seconds, from 277412 kB to 417472 kB
>
> 2016-03-02T16:41:06.960Z|00023|memory|INFO|peak resident set size grew 67%
> in last 100.0 seconds, from 417472 kB to 696732 kB
>
> 2016-03-02T16:44:17.180Z|00025|memory|INFO|peak resident set size grew 54%
> in last 190.2 seconds, from 696732 kB to 1071080 kB
>
> 2016-03-02T16:55:02.317Z|00027|memory|INFO|peak resident set size grew 68%
> in last 645.1 seconds, from 1071080 kB to 1794276 kB
>
> 2016-03-02T17:03:22.544Z|00029|memory|INFO|peak resident set size grew 50%
> in last 500.2 seconds, from 1794276 kB to 2698100 kB
>
> 2016-03-02T17:12:18.359Z|00031|memory|INFO|peak resident set size grew 54%
> in last 535.8 seconds, from 2698100 kB to 4158336 kB
>
> 2016-03-02T17:27:59.897Z|00070|memory|INFO|peak resident set size grew 55%
> in last 941.5 seconds, from 4158336 kB to 6458136 kB
>
> 2016-03-02T17:51:04.005Z|00136|memory|INFO|peak resident set size grew 53%
> in last 1384.1 seconds, from 6458136 kB to 9896844 kB
>
> 2016-03-02T18:37:52.913Z|00243|memory|INFO|peak resident set size grew 57%
> in last 2808.9 seconds, from 9896844 kB to 15539132 kB
>
> 2016-03-02T19:21:33.977Z|00414|memory|INFO|peak resident set size grew 51%
> in last 2621.1 seconds, from 15539132 kB to 23403744 kB
>
> 2016-03-02T20:27:53.938Z|00661|memory|INFO|peak resident set size grew 51%
> in last 3980.0 seconds, from 23403744 kB to 35348044 kB
>
> 2016-03-02T22:11:13.200Z|00957|memory|INFO|peak resident set size grew 53%
> in last 6199.3 seconds, from 35348044 kB to 53943084 kB
>
> 2016-03-03T00:49:32.196Z|01306|memory|INFO|peak resident set size grew 51%
> in last 9499.0 seconds, from 53943084 kB to 81658988 kB
>
>
>
> In another small test environment, I checked the northbound ovsdb-server
> with valgrind:
>
> $ valgrind --leak-check=full --show-leak-kinds=all ovsdb-server --no-chdir
> --pidfile=ovsdb-server-nb.pid --unixctl=ovsdb-server-nb.ctl -vconsole:off
> -vsyslog:off -vfile:info --log-file=ovsdb-server-nb.log
> --remote=punix:/home/rally/controller-sandbox/db-nb.sock conf-nb.db ovnnb.db
>
>
> But no obvious memory leak was found. So I made ovsdb-server exit without
> cleanup to see what the memory is used for; the code change is below:
>
>
> diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c
> index fa662b1..4498eaa 100644
> --- a/ovsdb/ovsdb-server.c
> +++ b/ovsdb/ovsdb-server.c
> @@ -339,7 +339,7 @@ main(int argc, char *argv[])
>                                ovsdb_server_disable_monitor2, jsonrpc);
>
>      main_loop(jsonrpc, &all_dbs, unixctl, &remotes, run_process, &exiting);
> -
> +    return 0;
>      ovsdb_jsonrpc_server_destroy(jsonrpc);
>      SHASH_FOR_EACH_SAFE(node, next, &all_dbs) {
>          struct db *db = node->data;
>
>
> After 100 ports were created and bound, the memory used was about 200 MB,
> and the valgrind output is as below:
>
> ==23712== 24,994,032 bytes in 520,709 blocks are possibly lost in loss record 228 of 229
> ==23712==    at 0x4C2CC70: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==23712==    by 0x44140C: xcalloc (in /usr/local/sbin/ovsdb-server)
> ==23712==    by 0x44145A: xzalloc (in /usr/local/sbin/ovsdb-server)
> ==23712==    by 0x410DFF: ovsdb_monitor_changes_update (in /usr/local/sbin/ovsdb-server)
> ==23712==    by 0x4110F6: ovsdb_monitor_change_cb (in /usr/local/sbin/ovsdb-server)
> ==23712==    by 0x41780A: ovsdb_txn_for_each_change (in /usr/local/sbin/ovsdb-server)
> ==23712==    by 0x4119A0: ovsdb_monitor_commit (in /usr/local/sbin/ovsdb-server)
> ==23712==    by 0x41767C: ovsdb_txn_commit_ (in /usr/local/sbin/ovsdb-server)
> ==23712==    by 0x41775D: ovsdb_txn_commit (in /usr/local/sbin/ovsdb-server)
> ==23712==    by 0x419011: ovsdb_execute (in /usr/local/sbin/ovsdb-server)
> ==23712==    by 0x414D2E: ovsdb_trigger_try (in /usr/local/sbin/ovsdb-server)
> ==23712==    by 0x414A8B: ovsdb_trigger_init (in /usr/local/sbin/ovsdb-server)
> ==23712==
> ==23712== 124,299,840 bytes in 520,709 blocks are possibly lost in loss record 229 of 229
> ==23712==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==23712==    by 0x441487: xmalloc (in /usr/local/sbin/ovsdb-server)
> ==23712==    by 0x40F975: clone_monitor_row_data (in /usr/local/sbin/ovsdb-server)
> ==23712==    by 0x410E70: ovsdb_monitor_changes_update (in /usr/local/sbin/ovsdb-server)
> ==23712==    by 0x4110F6: ovsdb_monitor_change_cb (in /usr/local/sbin/ovsdb-server)
> ==23712==    by 0x41780A: ovsdb_txn_for_each_change (in /usr/local/sbin/ovsdb-server)
> ==23712==    by 0x4119A0: ovsdb_monitor_commit (in /usr/local/sbin/ovsdb-server)
> ==23712==    by 0x41767C: ovsdb_txn_commit_ (in /usr/local/sbin/ovsdb-server)
> ==23712==    by 0x41775D: ovsdb_txn_commit (in /usr/local/sbin/ovsdb-server)
> ==23712==    by 0x419011: ovsdb_execute (in /usr/local/sbin/ovsdb-server)
> ==23712==    by 0x414D2E: ovsdb_trigger_try (in /usr/local/sbin/ovsdb-server)
> ==23712==    by 0x414A8B: ovsdb_trigger_init (in /usr/local/sbin/ovsdb-server)
> ==23712==
> ==23712== LEAK SUMMARY:
> ==23712==    definitely lost: 159 bytes in 4 blocks
> ==23712==    indirectly lost: 398 bytes in 13 blocks
> ==23712==      possibly lost: 212,065,579 bytes in 3,963,543 blocks
> ==23712==    still reachable: 1,156,295 bytes in 9,686 blocks
> ==23712==         suppressed: 0 bytes in 0 blocks
>
>
> It seems like about 150 MB of memory is used by the ovsdb monitor. I am
> looking into this problem, but because I'm not familiar with ovsdb-server's
> source, it may take some time. Do you have any idea about this problem?
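>
> (For a rough breakdown from the server's own accounting, the "memory/show"
> appctl command may also help; as far as I can tell it reports counts of
> cells/monitors/sessions rather than bytes, but it could still help correlate
> growth with the number of monitors, e.g.:
>
>     ovs-appctl -t "$(pwd)"/ovsdb-server-nb.ctl memory/show
>
> using the unixctl socket from the valgrind command above.)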
>
>
> BR
>
> Huang Lei


Thanks, Lei, for the report and the analysis! An update on the problem:

Both Andy and Lei found the bug in the ovsdb monitor: when a monitor is
removed, the tracked changes were not dereferenced, so the memory was never
released.

Andy provided a quick fix and I just verified it: no obvious memory
increase for ovsdb NB.

Big thanks to Andy and Lei. Looking forward to the formal patch from Andy :)

--
Best regards,
Han
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
