Re: [Qemu-devel] [PATCH] replication: interrupt failover if the main device is closed
On Fri, Oct 07, 2016 at 02:21:33PM +0200, Paolo Bonzini wrote: > Without this change, there is a race condition in tests/test-replication. > Depending on how fast the failover job (active commit) runs, there is a > chance of two bad things happening: > > 1) replication_done can be called after the secondary has been closed > and hence when the BDRVReplicationState is not valid anymore. > > 2) two copies of the active disk are present during the > /replication/secondary/stop test (that test runs immediately after > /replication/secondary/start, which tests failover). This causes the > corruption detector to fire. > > Signed-off-by: Paolo Bonzini > --- > block/replication.c | 3 +++ > 1 file changed, 3 insertions(+) Thanks, applied to my block tree: https://github.com/stefanha/qemu/commits/block Stefan signature.asc Description: PGP signature
Re: [Qemu-devel] [PATCH] replication: interrupt failover if the main device is closed
On 10/07/2016 08:21 PM, Paolo Bonzini wrote: Without this change, there is a race condition in tests/test-replication. Depending on how fast the failover job (active commit) runs, there is a chance of two bad things happening: 1) replication_done can be called after the secondary has been closed and hence when the BDRVReplicationState is not valid anymore. 2) two copies of the active disk are present during the /replication/secondary/stop test (that test runs immediately after /replication/secondary/start, which tests failover). This causes the corruption detector to fire. Signed-off-by: Paolo Bonzini Reviewed-by: Changlong Xie --- block/replication.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/block/replication.c b/block/replication.c index 3bd1cf1..5231a00 100644 --- a/block/replication.c +++ b/block/replication.c @@ -133,6 +133,9 @@ static void replication_close(BlockDriverState *bs) if (s->replication_state == BLOCK_REPLICATION_RUNNING) { replication_stop(s->rs, false, NULL); } +if (s->replication_state == BLOCK_REPLICATION_FAILOVER) { +block_job_cancel_sync(s->active_disk->bs->job); +} if (s->mode == REPLICATION_MODE_SECONDARY) { g_free(s->top_id);
Re: [Qemu-devel] [PATCH] replication: interrupt failover if the main device is closed
At 2016/10/7 20:21, Paolo Bonzini wrote: Without this change, there is a race condition in tests/test-replication. Depending on how fast the failover job (active commit) runs, there is a chance of two bad things happening: 1) replication_done can be called after the secondary has been closed and hence when the BDRVReplicationState is not valid anymore. 2) two copies of the active disk are present during the /replication/secondary/stop test (that test runs immediately after /replication/secondary/start, which tests failover). This causes the corruption detector to fire. Signed-off-by: Paolo Bonzini This patch looks fine to me. Reviewed-by: Wen Congyang --- block/replication.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/block/replication.c b/block/replication.c index 3bd1cf1..5231a00 100644 --- a/block/replication.c +++ b/block/replication.c @@ -133,6 +133,9 @@ static void replication_close(BlockDriverState *bs) if (s->replication_state == BLOCK_REPLICATION_RUNNING) { replication_stop(s->rs, false, NULL); } +if (s->replication_state == BLOCK_REPLICATION_FAILOVER) { +block_job_cancel_sync(s->active_disk->bs->job); +} if (s->mode == REPLICATION_MODE_SECONDARY) { g_free(s->top_id);
[Qemu-devel] [PATCH] replication: interrupt failover if the main device is closed
Without this change, there is a race condition in tests/test-replication. Depending on how fast the failover job (active commit) runs, there is a chance of two bad things happening: 1) replication_done can be called after the secondary has been closed and hence when the BDRVReplicationState is not valid anymore. 2) two copies of the active disk are present during the /replication/secondary/stop test (that test runs immediately after /replication/secondary/start, which tests failover). This causes the corruption detector to fire. Signed-off-by: Paolo Bonzini --- block/replication.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/block/replication.c b/block/replication.c index 3bd1cf1..5231a00 100644 --- a/block/replication.c +++ b/block/replication.c @@ -133,6 +133,9 @@ static void replication_close(BlockDriverState *bs) if (s->replication_state == BLOCK_REPLICATION_RUNNING) { replication_stop(s->rs, false, NULL); } +if (s->replication_state == BLOCK_REPLICATION_FAILOVER) { +block_job_cancel_sync(s->active_disk->bs->job); +} if (s->mode == REPLICATION_MODE_SECONDARY) { g_free(s->top_id); -- 2.7.4