Re: [ovs-dev] [PATCH 2/2] ovsdb raft: Fix the problem when cluster restarted after DB compaction.

2019-12-20 Thread Ben Pfaff
On Tue, Dec 03, 2019 at 05:57:20PM -0800, Han Zhou wrote:
> Cluster doesn't work after all nodes restarted after DB compaction,
> unless there is any transaction after DB compaction before the restart.
> 
> Error log is like:
> raft|ERR|internal error: deferred vote_request message completed but not ready
> to send because message index 9 is past last synced index 0: s2 vote_request:
> term=6 last_log_index=9 last_log_term=4
> 
> The root cause is that the log_synced member is not initialized when
> reading the raft header. This patch fixes it and remove the XXX
> from the test case.
> 
> Signed-off-by: Han Zhou 

Thank you for finding this bug!  It must have been subtle.

I applied both of these patches to master and branch-2.12.

Thanks again,

Ben.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 2/2] ovsdb raft: Fix the problem when cluster restarted after DB compaction.

2019-12-03 Thread Han Zhou
The cluster doesn't work after all nodes are restarted following DB
compaction, unless a transaction was committed between the compaction
and the restart.

The error log looks like:
raft|ERR|internal error: deferred vote_request message completed but not ready
to send because message index 9 is past last synced index 0: s2 vote_request:
term=6 last_log_index=9 last_log_term=4

The root cause is that the log_synced member is not initialized when
reading the raft header. This patch fixes that and removes the XXX
workaround from the test case.
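
To illustrate the failure, here is a minimal sketch of the invariant that
the error message implies: a vote request may go out only once the log
index it refers to is at or below the last synced index. The names
mini_raft, mini_read_header and mini_can_send_vote_request are hypothetical
and are not the actual raft.c code; the snapshot index 9 is assumed only so
that the numbers match the log above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the fields of struct raft that
 * matter for this bug. */
struct mini_raft {
    uint64_t log_start;
    uint64_t log_end;
    uint64_t log_synced;    /* Highest log index known to be on disk. */
    uint64_t commit_index;
    uint64_t last_applied;
};

/* Models the snapshot branch of reading the header.  'init_log_synced'
 * selects between the old behavior (false) and the patched one (true). */
static void
mini_read_header(struct mini_raft *r, uint64_t snap_index,
                 bool init_log_synced)
{
    assert(snap_index >= 1);    /* Avoid underflow in last_applied. */
    r->log_start = r->log_end = snap_index + 1;
    r->commit_index = snap_index;
    if (init_log_synced) {
        r->log_synced = snap_index;
    }
    r->last_applied = snap_index - 1;
}

/* Assumed form of the readiness check: the last log index named in the
 * vote request must not be past the last synced index. */
static bool
mini_can_send_vote_request(const struct mini_raft *r)
{
    uint64_t last_log_index = r->log_end - 1;
    return last_log_index <= r->log_synced;
}
```

With a zeroed struct and snap_index 9, the old behavior leaves log_synced
at 0 while last_log_index is 9, so the check fails as in the error log.
With log_synced set from the snapshot index, the check passes without any
extra transaction after compaction.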

Signed-off-by: Han Zhou 
---
 ovsdb/raft.c           | 2 +-
 tests/ovsdb-cluster.at | 8 --------
 2 files changed, 1 insertion(+), 9 deletions(-)

diff --git a/ovsdb/raft.c b/ovsdb/raft.c
index f354d50..4789bc4 100644
--- a/ovsdb/raft.c
+++ b/ovsdb/raft.c
@@ -849,7 +849,7 @@ raft_read_header(struct raft *raft)
     } else {
         raft_entry_clone(&raft->snap, &h.snap);
         raft->log_start = raft->log_end = h.snap_index + 1;
-        raft->commit_index = h.snap_index;
+        raft->log_synced = raft->commit_index = h.snap_index;
         raft->last_applied = h.snap_index - 1;
     }
 
diff --git a/tests/ovsdb-cluster.at b/tests/ovsdb-cluster.at
index 79c851e..3a0bd45 100644
--- a/tests/ovsdb-cluster.at
+++ b/tests/ovsdb-cluster.at
@@ -246,14 +246,6 @@ for i in `seq $n`; do
     AT_CHECK([ovs-appctl -t "`pwd`"/s$i ovsdb-server/compact])
 done
 
-# XXX: Insert data after compact, because otherwise vote will fail after
-# cluster restart after compact. There will be error logs like:
-# raft|ERR|internal error: deferred vote_request message completed but not ready to send because message index 9 is past last synced index 0: s2 vote_request: term=6 last_log_index=9 last_log_term=4
-AT_CHECK([ovsdb-client transact unix:s1.ovsdb '[["idltest",
-  {"op": "insert",
-   "table": "simple",
-   "row": {"i": 1}}]]'], [0], [ignore], [ignore])
-
 for i in `seq $n`; do
     printf "\ns$i: stopping\n"
     OVS_APP_EXIT_AND_WAIT_BY_TARGET([`pwd`/s$i], [s$i.pid])
-- 
2.1.0
