On Mon, Mar 07, 2016 at 01:57:22PM -0800, Han Zhou wrote:
> On Sun, Mar 6, 2016 at 11:02 PM, Lei Huang <huang.f....@gmail.com> wrote:
> >
> > Hi,
> >
> > During a scalability test, we found that the OVN northbound ovsdb-server's
> > memory usage becomes very high while creating and binding ports. The test
> > steps are:
> >
> > 1. Create 1000 sandboxes.
> >
> > 2. On the northbound database, create 5 lswitches and create 200 lports
> >    for each lswitch.
> >
> > 3. For each lport created in step 2, bind it to a random sandbox, then use
> >    'ovn-nbctl wait-until Logical_Port <port-name> up=true' to wait until
> >    the port is 'up' in the northbound database.
> >
> > 4. Go to step 2.
> >
> > The VmRSS memory usage (kB):
> >
> > lport number    ovn-northd    nb ovsdb-server    sb ovsdb-server
> >        0              1292               1124               7104
> >     5000             21704            6476420              17424
> >    10000             41680           25446148              31808
> >    15000             61600           56893928             133864
> >    20000             81844          100955012             176868
> >
> > The memory log:
> >
> > 2016-03-02T09:48:32.024Z|00003|memory|INFO|1124 kB peak resident set size after 10.0 seconds
> > 2016-03-02T16:24:46.430Z|00005|memory|INFO|peak resident set size grew 150% in last 23774.4 seconds, from 1124 kB to 2808 kB
> > 2016-03-02T16:25:06.432Z|00007|memory|INFO|peak resident set size grew 130% in last 20.0 seconds, from 2808 kB to 6448 kB
> > 2016-03-02T16:25:16.433Z|00009|memory|INFO|peak resident set size grew 54% in last 10.0 seconds, from 6448 kB to 9948 kB
> > 2016-03-02T16:25:26.434Z|00011|memory|INFO|peak resident set size grew 52% in last 10.0 seconds, from 9948 kB to 15072 kB
> > 2016-03-02T16:25:36.436Z|00013|memory|INFO|peak resident set size grew 178% in last 10.0 seconds, from 15072 kB to 41956 kB
> > 2016-03-02T16:26:36.439Z|00015|memory|INFO|peak resident set size grew 140% in last 60.0 seconds, from 41956 kB to 100696 kB
> > 2016-03-02T16:27:36.444Z|00017|memory|INFO|peak resident set size grew 78% in last 60.0 seconds, from 100696 kB to 179108 kB
> > 2016-03-02T16:28:46.444Z|00019|memory|INFO|peak resident set size grew 55% in last 70.0 seconds, from 179108 kB to 277412 kB
> > 2016-03-02T16:39:26.953Z|00021|memory|INFO|peak resident set size grew 50% in last 640.5 seconds, from 277412 kB to 417472 kB
> > 2016-03-02T16:41:06.960Z|00023|memory|INFO|peak resident set size grew 67% in last 100.0 seconds, from 417472 kB to 696732 kB
> > 2016-03-02T16:44:17.180Z|00025|memory|INFO|peak resident set size grew 54% in last 190.2 seconds, from 696732 kB to 1071080 kB
> > 2016-03-02T16:55:02.317Z|00027|memory|INFO|peak resident set size grew 68% in last 645.1 seconds, from 1071080 kB to 1794276 kB
> > 2016-03-02T17:03:22.544Z|00029|memory|INFO|peak resident set size grew 50% in last 500.2 seconds, from 1794276 kB to 2698100 kB
> > 2016-03-02T17:12:18.359Z|00031|memory|INFO|peak resident set size grew 54% in last 535.8 seconds, from 2698100 kB to 4158336 kB
> > 2016-03-02T17:27:59.897Z|00070|memory|INFO|peak resident set size grew 55% in last 941.5 seconds, from 4158336 kB to 6458136 kB
> > 2016-03-02T17:51:04.005Z|00136|memory|INFO|peak resident set size grew 53% in last 1384.1 seconds, from 6458136 kB to 9896844 kB
> > 2016-03-02T18:37:52.913Z|00243|memory|INFO|peak resident set size grew 57% in last 2808.9 seconds, from 9896844 kB to 15539132 kB
> > 2016-03-02T19:21:33.977Z|00414|memory|INFO|peak resident set size grew 51% in last 2621.1 seconds, from 15539132 kB to 23403744 kB
> > 2016-03-02T20:27:53.938Z|00661|memory|INFO|peak resident set size grew 51% in last 3980.0 seconds, from 23403744 kB to 35348044 kB
> > 2016-03-02T22:11:13.200Z|00957|memory|INFO|peak resident set size grew 53% in last 6199.3 seconds, from 35348044 kB to 53943084 kB
> > 2016-03-03T00:49:32.196Z|01306|memory|INFO|peak resident set size grew 51% in last 9499.0 seconds, from 53943084 kB to 81658988 kB
> >
> > On another small test environment, I checked the northbound ovsdb-server
> > with valgrind:
> >
> > $ valgrind --leak-check=full --show-leak-kinds=all ovsdb-server --no-chdir
> >   --pidfile=ovsdb-server-nb.pid --unixctl=ovsdb-server-nb.ctl -vconsole:off
> >   -vsyslog:off -vfile:info --log-file=ovsdb-server-nb.log
> >   --remote=punix:/home/rally/controller-sandbox/db-nb.sock conf-nb.db ovnnb.db
> >
> > But no obvious memory leak was found. So I made ovsdb-server exit without
> > cleanup to see what the memory is being used for. The code change:
> >
> > diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c
> > index fa662b1..4498eaa 100644
> > --- a/ovsdb/ovsdb-server.c
> > +++ b/ovsdb/ovsdb-server.c
> > @@ -339,7 +339,7 @@ main(int argc, char *argv[])
> >                                   ovsdb_server_disable_monitor2, jsonrpc);
> >
> >       main_loop(jsonrpc, &all_dbs, unixctl, &remotes, run_process, &exiting);
> > -
> > +    return 0;
> >       ovsdb_jsonrpc_server_destroy(jsonrpc);
> >       SHASH_FOR_EACH_SAFE(node, next, &all_dbs) {
> >           struct db *db = node->data;
> >
> > After 100 ports were created and bound, the memory used was about 200 MB,
> > and the output of valgrind was:
> >
> > ==23712== 24,994,032 bytes in 520,709 blocks are possibly lost in loss record 228 of 229
> > ==23712==    at 0x4C2CC70: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> > ==23712==    by 0x44140C: xcalloc (in /usr/local/sbin/ovsdb-server)
> > ==23712==    by 0x44145A: xzalloc (in /usr/local/sbin/ovsdb-server)
> > ==23712==    by 0x410DFF: ovsdb_monitor_changes_update (in /usr/local/sbin/ovsdb-server)
> > ==23712==    by 0x4110F6: ovsdb_monitor_change_cb (in /usr/local/sbin/ovsdb-server)
> > ==23712==    by 0x41780A: ovsdb_txn_for_each_change (in /usr/local/sbin/ovsdb-server)
> > ==23712==    by 0x4119A0: ovsdb_monitor_commit (in /usr/local/sbin/ovsdb-server)
> > ==23712==    by 0x41767C: ovsdb_txn_commit_ (in /usr/local/sbin/ovsdb-server)
> > ==23712==    by 0x41775D: ovsdb_txn_commit (in /usr/local/sbin/ovsdb-server)
> > ==23712==    by 0x419011: ovsdb_execute (in /usr/local/sbin/ovsdb-server)
> > ==23712==    by 0x414D2E: ovsdb_trigger_try (in /usr/local/sbin/ovsdb-server)
> > ==23712==    by 0x414A8B: ovsdb_trigger_init (in /usr/local/sbin/ovsdb-server)
> > ==23712==
> > ==23712== 124,299,840 bytes in 520,709 blocks are possibly lost in loss record 229 of 229
> > ==23712==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> > ==23712==    by 0x441487: xmalloc (in /usr/local/sbin/ovsdb-server)
> > ==23712==    by 0x40F975: clone_monitor_row_data (in /usr/local/sbin/ovsdb-server)
> > ==23712==    by 0x410E70: ovsdb_monitor_changes_update (in /usr/local/sbin/ovsdb-server)
> > ==23712==    by 0x4110F6: ovsdb_monitor_change_cb (in /usr/local/sbin/ovsdb-server)
> > ==23712==    by 0x41780A: ovsdb_txn_for_each_change (in /usr/local/sbin/ovsdb-server)
> > ==23712==    by 0x4119A0: ovsdb_monitor_commit (in /usr/local/sbin/ovsdb-server)
> > ==23712==    by 0x41767C: ovsdb_txn_commit_ (in /usr/local/sbin/ovsdb-server)
> > ==23712==    by 0x41775D: ovsdb_txn_commit (in /usr/local/sbin/ovsdb-server)
> > ==23712==    by 0x419011: ovsdb_execute (in /usr/local/sbin/ovsdb-server)
> > ==23712==    by 0x414D2E: ovsdb_trigger_try (in /usr/local/sbin/ovsdb-server)
> > ==23712==    by 0x414A8B: ovsdb_trigger_init (in /usr/local/sbin/ovsdb-server)
> > ==23712==
> > ==23712== LEAK SUMMARY:
> > ==23712==    definitely lost: 159 bytes in 4 blocks
> > ==23712==    indirectly lost: 398 bytes in 13 blocks
> > ==23712==      possibly lost: 212,065,579 bytes in 3,963,543 blocks
> > ==23712==    still reachable: 1,156,295 bytes in 9,686 blocks
> > ==23712==         suppressed: 0 bytes in 0 blocks
> >
> > It seems that about 150 MB of memory is held by the ovsdb monitor. I am
> > looking into this problem, but because I'm not familiar with
> > ovsdb-server's source, it may take some time. Do you have any idea about
> > this problem?
> >
> > BR
> > Huang Lei
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > http://openvswitch.org/mailman/listinfo/dev
>
> Thanks, Lei, for reporting this and for the analysis! Update on the problem:
>
> Both Andy and Lei found the bug in the ovsdb monitor: when a monitor is
> removed, its tracked changes are not dereferenced, so their memory is never
> released.
>
> Andy provided a quick fix and I just verified it: no obvious memory
> increase for the ovsdb NB server.
>
> Big thanks to Andy and Lei. Looking forward to the formal patch from Andy :)
That's great news. Thanks Andy and Lei!