On Wed, Feb 27, 2019 at 5:26 PM Ben Pfaff <b...@ovn.org> wrote:
>
> On Mon, Feb 25, 2019 at 09:25:03AM -0800, Han Zhou wrote:
> > In scalability tests with ovn-scale-test, ovsdb-server SB load is not a
> > problem, at least with 1k HVs. However, if we restart the ovsdb-server,
> > depending on the number of HVs and the scale of logical objects, e.g. the
> > number of logical ports, the SB ovsdb-server becomes an obvious bottleneck.
> >
> > In our test with 1k HVs and 20k logical ports (200 lports * 100 lswitches
> > connected by a single logical router), restarting the SB ovsdb-server
> > resulted in 100% CPU usage by ovsdb-server for more than 1 hour. All HVs
> > (and northd) were reconnecting and resyncing the large amount of data at
> > the same time.
> >
> > A similar problem would happen in a failover scenario. With an
> > active-active cluster, the problem can be alleviated slightly, because
> > only 1/3 (assuming a 3-node cluster) of the HVs will need to resync data
> > from new servers, but it is still a serious problem.
> >
> > For a detailed discussion of the problem and solutions, see:
> > https://mail.openvswitch.org/pipermail/ovs-discuss/2018-October/047591.html
>
> Thanks.
>
> When I apply this series, I get a reproducible test failure in test
> 1920 "schema conversion online - clustered".  It's an error from Address
> Sanitizer.  I'm appending the testsuite.log.
>

Thanks, Ben, for catching this. I should have enabled AddressSanitizer for
the regression tests. Please find below a patch that fixes this bug. I will
send v4 with this fix folded into patch 2/5 ("ovsdb-server: Transaction
history tracking").

----8><----------------------------------------------------><8----
diff --git a/ovsdb/ovsdb.c b/ovsdb/ovsdb.c
index ea7dd23..cfc96b3 100644
--- a/ovsdb/ovsdb.c
+++ b/ovsdb/ovsdb.c
@@ -538,6 +538,9 @@ ovsdb_replace(struct ovsdb *dst, struct ovsdb *src)
         ovsdb_trigger_prereplace_db(trigger);
     }

+    /* Destroy txn history. */
+    ovsdb_txn_history_destroy(dst);
+
     struct ovsdb_schema *tmp_schema = dst->schema;
     dst->schema = src->schema;
     src->schema = tmp_schema;
diff --git a/ovsdb/transaction.c b/ovsdb/transaction.c
index 0081840..b3f4946 100644
--- a/ovsdb/transaction.c
+++ b/ovsdb/transaction.c
@@ -1415,7 +1415,9 @@ ovsdb_txn_history_destroy(struct ovsdb *db)

     struct ovsdb_txn_history_node *txn_h_node, *next;
     LIST_FOR_EACH_SAFE (txn_h_node, next, node, &db->txn_history) {
+        ovs_list_remove(&txn_h_node->node);
         ovsdb_txn_destroy_cloned(txn_h_node->txn);
         free(txn_h_node);
     }
+    db->n_txn_history = 0;
 }
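
To illustrate the pattern the second hunk applies, here is a minimal
standalone sketch. It is not the OVS code: the list type, the helpers and
the struct names (list_node, history_entry, db) are hypothetical stand-ins
for the OVS list API and the txn history structures. The idea is that a
destroy function which frees every entry but leaves the list head and the
counter untouched leaves the head pointing at freed memory, so any later
use of the same db (as after ovsdb_replace()) would touch freed nodes.
Unlinking each node before freeing it and resetting the counter leaves the
history in a valid empty state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a circular doubly linked list node. */
struct list_node {
    struct list_node *prev, *next;
};

/* Hypothetical history entry; 'node' is first so a node pointer is also
 * a pointer to the enclosing entry. */
struct history_entry {
    struct list_node node;
    int txn_id;
};

/* Hypothetical database object holding the history list and its length. */
struct db {
    struct list_node history;   /* List head. */
    size_t n_history;           /* Number of entries on 'history'. */
};

static void list_init(struct list_node *head)
{
    head->prev = head->next = head;
}

static void list_push_back(struct list_node *head, struct list_node *n)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

static void list_remove(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

static void db_init(struct db *db)
{
    list_init(&db->history);
    db->n_history = 0;
}

/* Appends one entry; returns 0 on success, -1 if allocation fails. */
static int db_history_add(struct db *db, int txn_id)
{
    struct history_entry *e = malloc(sizeof *e);
    if (!e) {
        return -1;
    }
    e->txn_id = txn_id;
    list_push_back(&db->history, &e->node);
    db->n_history++;
    return 0;
}

/* Frees every entry and leaves 'db' reusable: each node is unlinked
 * before it is freed, and the counter is reset at the end. */
static void db_history_destroy(struct db *db)
{
    struct list_node *n = db->history.next;
    while (n != &db->history) {
        struct list_node *next = n->next;   /* Save before freeing. */
        list_remove(n);
        free((struct history_entry *) n);
        n = next;
    }
    db->n_history = 0;
}

/* True if the head points only at itself, i.e. no dangling entries. */
static int db_history_is_empty(const struct db *db)
{
    return db->history.next == &db->history
           && db->history.prev == &db->history;
}
```

With this shape, calling db_history_destroy() and then db_history_add() on
the same db is safe, which is the property the schema conversion path needs
once the destination db keeps being used after its history is destroyed.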
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev