Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
On 22/06/21 12:39, Vladimir Sementsov-Ogievskiy wrote: 22.06.2021 13:20, Paolo Bonzini wrote: On 22/06/21 11:36, Vladimir Sementsov-Ogievskiy wrote: OK, I agree, let's keep it. You can also have a finished job, but get a stale value for error_is_read or ret. The issue is not in getting the stale value per se, but in block_copy_call_status's caller not expecting it. Hmm. So, do you mean that we can read ret and error_is_read ONLY after explicitly doing load_acquire(finished) and checking that it's true? That means that we must do it not in assertion (to not be compiled out): bool finished = load_acquire() assert(finished); ... read reat and error_is_read ... In reality you must have synchronized in some other way; that outside synchronization outside block-copy.c is what guarantees that the assertion does not fail. The simplest cases are: - a mutex: "unlock" counts as release, "lock" counts as acquire; - a bottom half: "schedule" counts as release, running counts as acquire. Therefore, removing the assertion would work just fine because the synchronization would be like this: write ret/error_is_read write finished trigger bottom half or something -->bottom half runs read ret/error_is_read So there is no need at all to read ->finished, much less to load it outside the assertion. At the same time there are two problems with "assert(qatomic_read(&call_state->finished))". Note that they are not logic issues, they are maintenance issues. First, if *there is a bug elsewhere* and the above synchronization doesn't happen, it may manifest sometimes as an assertion failure and sometimes as a memory reordering. This is what I was talking about in the previous message, and it is probably the last thing that you want when debugging; since we're adding asserts defensively, we might as well do it well. Second, if somebody later carelessly changes the function to if (qatomic_read(&call_state->finished)) { ... } else { error_setg(...); } that would be broken. Using qatomic_load_acquire makes the code more future-proof against a change like the one above. Paolo
Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
On 22/06/2021 12:39, Vladimir Sementsov-Ogievskiy wrote: 22.06.2021 13:20, Paolo Bonzini wrote: On 22/06/21 11:36, Vladimir Sementsov-Ogievskiy wrote: It does. If it returns true, you still want the load of finished to happen before the reads that follow. Hmm.. The worst case if we use just qatomic_read is that assertion will not crash when it actually should. That doesn't break the logic. But that's not good anyway. OK, I agree, let's keep it. You can also have a finished job, but get a stale value for error_is_read or ret. The issue is not in getting the stale value per se, but in block_copy_call_status's caller not expecting it. (I understand you agree, but I guess it can be interesting to learn about this too). Hmm. So, do you mean that we can read ret and error_is_read ONLY after explicitly doing load_acquire(finished) and checking that it's true? That means that we must do it not in assertion (to not be compiled out): bool finished = load_acquire() assert(finished); ... read reat and error_is_read ... If I understand correctly, this was what I was trying to say before: maybe it's better that we make sure that @finished is set before reading @ret and @error_is_read. And because assert can be disabled, we can do like you wrote above. Anyways, let's wait Paolo's answer for this. Once this is ready, I will send v5. Emanuele
Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
22.06.2021 13:20, Paolo Bonzini wrote: On 22/06/21 11:36, Vladimir Sementsov-Ogievskiy wrote: It does. If it returns true, you still want the load of finished to happen before the reads that follow. Hmm.. The worst case if we use just qatomic_read is that assertion will not crash when it actually should. That doesn't break the logic. But that's not good anyway. OK, I agree, let's keep it. You can also have a finished job, but get a stale value for error_is_read or ret. The issue is not in getting the stale value per se, but in block_copy_call_status's caller not expecting it. (I understand you agree, but I guess it can be interesting to learn about this too). Hmm. So, do you mean that we can read ret and error_is_read ONLY after explicitly doing load_acquire(finished) and checking that it's true? That means that we must do it not in assertion (to not be compiled out): bool finished = load_acquire() assert(finished); ... read reat and error_is_read ... -- Best regards, Vladimir
Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
On 22/06/21 11:36, Vladimir Sementsov-Ogievskiy wrote: It does. If it returns true, you still want the load of finished to happen before the reads that follow. Hmm.. The worst case if we use just qatomic_read is that assertion will not crash when it actually should. That doesn't break the logic. But that's not good anyway. OK, I agree, let's keep it. You can also have a finished job, but get a stale value for error_is_read or ret. The issue is not in getting the stale value per se, but in block_copy_call_status's caller not expecting it. (I understand you agree, but I guess it can be interesting to learn about this too). Paolo
Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
21.06.2021 12:30, Emanuele Giuseppe Esposito wrote: On 19/06/2021 22:06, Vladimir Sementsov-Ogievskiy wrote: 14.06.2021 10:33, Emanuele Giuseppe Esposito wrote: By adding acquire/release pairs, we ensure that .ret and .error_is_read fields are written by block_copy_dirty_clusters before .finished is true. And that they are read by API user after .finished is true. The atomic here are necessary because the fields are concurrently modified also outside coroutines. To be honest, finished is modified only in coroutine. And read outside. Signed-off-by: Emanuele Giuseppe Esposito --- block/block-copy.c | 33 ++--- 1 file changed, 18 insertions(+), 15 deletions(-) diff --git a/block/block-copy.c b/block/block-copy.c index 6416929abd..5348e1f61b 100644 --- a/block/block-copy.c +++ b/block/block-copy.c @@ -53,14 +53,14 @@ typedef struct BlockCopyCallState { Coroutine *co; /* State */ - bool finished; + bool finished; /* atomic */ So, logic around finished: Thread of block_copy does: 0. finished is false 1. tasks set ret and error_is_read 2. qatomic_store_release finished -> true 3. after that point ret and error_is_read are not modified Other threads can: - qatomic_read finished, just to check are we finished or not - if finished, can read ret and error_is_read safely. If you not sure that block-copy finished, use qatomic_load_acquire() of finished first, to be sure that you read ret and error_is_read AFTER finished read and checked to be true. QemuCoSleep sleep; /* TODO: protect API with a lock */ /* To reference all call states from BlockCopyState */ QLIST_ENTRY(BlockCopyCallState) list; /* OUT parameters */ - bool cancelled; + bool cancelled; /* atomic */ Logic around cancelled is simpler: - false at start - qatomic_read is allowed from any thread - qatomic_write to true is allowed from any thread - never write to false Note that cancelling and finishing are racy. User can cancel block-copy that's already finished. We probably may improve change it, but I'm not sure that it worth doing. Still, maybe leave some comment in API documentation. /* Fields protected by lock in BlockCopyState */ bool error_is_read; int ret; @@ -650,7 +650,8 @@ block_copy_dirty_clusters(BlockCopyCallState *call_state) assert(QEMU_IS_ALIGNED(offset, s->cluster_size)); assert(QEMU_IS_ALIGNED(bytes, s->cluster_size)); - while (bytes && aio_task_pool_status(aio) == 0 && !call_state->cancelled) { + while (bytes && aio_task_pool_status(aio) == 0 && + !qatomic_read(&call_state->cancelled)) { BlockCopyTask *task; int64_t status_bytes; @@ -761,7 +762,7 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state) do { ret = block_copy_dirty_clusters(call_state); - if (ret == 0 && !call_state->cancelled) { + if (ret == 0 && !qatomic_read(&call_state->cancelled)) { WITH_QEMU_LOCK_GUARD(&s->lock) { /* * Check that there is no task we still need to @@ -792,9 +793,9 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state) * 2. We have waited for some intersecting block-copy request * It may have failed and produced new dirty bits. */ - } while (ret > 0 && !call_state->cancelled); + } while (ret > 0 && !qatomic_read(&call_state->cancelled)); - call_state->finished = true; + qatomic_store_release(&call_state->finished, true); so, all writes to ret and error_is_read are finished to this point. if (call_state->cb) { call_state->cb(call_state->cb_opaque); @@ -857,35 +858,37 @@ void block_copy_call_free(BlockCopyCallState *call_state) return; } - assert(call_state->finished); + assert(qatomic_load_acquire(&call_state->finished)); Here we don't need load_aquire, as we don't read other fields. qatomic_read is enough. So what you say makes sense, the only thing that I wonder is: wouldn't it be better to have the acquire without assertion (or assert afterwards), just to be sure that we delete when finished is true? Hmm. I think neither compiler nor processor should reorder read structure field and free() call on the whole structure :) And anyway for block_copy_call_free() caller is responsible for the structure not being used by other thread. } bool block_copy_call_cancelled(BlockCopyCallState *call_state) { - return call_state->cancelled; + return qatomic_read(&call_state->cancelled); } int block_copy_call_status(BlockCopyCallState *call_state, bool *error_is_read) { - assert(call_state->finished); + assert(qatomic_load_acquire(&call_state->finished)); Hmm. Here qatomic_load_acquire protects nothing (assertion will crash if not yet finished anyway). So, caller is double sure that block-copy is finished. Also it's misleading: if
Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
22.06.2021 11:15, Paolo Bonzini wrote: On 19/06/21 22:06, Vladimir Sementsov-Ogievskiy wrote: - assert(call_state->finished); + assert(qatomic_load_acquire(&call_state->finished)); Hmm. Here qatomic_load_acquire protects nothing (assertion will crash if not yet finished anyway). So, caller is double sure that block-copy is finished. It does. If it returns true, you still want the load of finished to happen before the reads that follow. Hmm.. The worst case if we use just qatomic_read is that assertion will not crash when it actually should. That doesn't break the logic. But that's not good anyway. OK, I agree, let's keep it. Otherwise I agree with your remarks. Paolo Also it's misleading: if we think that it do some protection, we are doing wrong thing: assertions may be simply compiled out, we can't rely on statements inside assert() to be executed. So, let's use simple qatomic_read here too. if (error_is_read) { *error_is_read = call_state->error_is_read; } -- Best regards, Vladimir
Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
On 19/06/21 22:06, Vladimir Sementsov-Ogievskiy wrote: - assert(call_state->finished); + assert(qatomic_load_acquire(&call_state->finished)); Hmm. Here qatomic_load_acquire protects nothing (assertion will crash if not yet finished anyway). So, caller is double sure that block-copy is finished. It does. If it returns true, you still want the load of finished to happen before the reads that follow. Otherwise I agree with your remarks. Paolo Also it's misleading: if we think that it do some protection, we are doing wrong thing: assertions may be simply compiled out, we can't rely on statements inside assert() to be executed. So, let's use simple qatomic_read here too. if (error_is_read) { *error_is_read = call_state->error_is_read; }
Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
On 19/06/2021 22:06, Vladimir Sementsov-Ogievskiy wrote: 14.06.2021 10:33, Emanuele Giuseppe Esposito wrote: By adding acquire/release pairs, we ensure that .ret and .error_is_read fields are written by block_copy_dirty_clusters before .finished is true. And that they are read by API user after .finished is true. The atomic here are necessary because the fields are concurrently modified also outside coroutines. To be honest, finished is modified only in coroutine. And read outside. Signed-off-by: Emanuele Giuseppe Esposito --- block/block-copy.c | 33 ++--- 1 file changed, 18 insertions(+), 15 deletions(-) diff --git a/block/block-copy.c b/block/block-copy.c index 6416929abd..5348e1f61b 100644 --- a/block/block-copy.c +++ b/block/block-copy.c @@ -53,14 +53,14 @@ typedef struct BlockCopyCallState { Coroutine *co; /* State */ - bool finished; + bool finished; /* atomic */ So, logic around finished: Thread of block_copy does: 0. finished is false 1. tasks set ret and error_is_read 2. qatomic_store_release finished -> true 3. after that point ret and error_is_read are not modified Other threads can: - qatomic_read finished, just to check are we finished or not - if finished, can read ret and error_is_read safely. If you not sure that block-copy finished, use qatomic_load_acquire() of finished first, to be sure that you read ret and error_is_read AFTER finished read and checked to be true. QemuCoSleep sleep; /* TODO: protect API with a lock */ /* To reference all call states from BlockCopyState */ QLIST_ENTRY(BlockCopyCallState) list; /* OUT parameters */ - bool cancelled; + bool cancelled; /* atomic */ Logic around cancelled is simpler: - false at start - qatomic_read is allowed from any thread - qatomic_write to true is allowed from any thread - never write to false Note that cancelling and finishing are racy. User can cancel block-copy that's already finished. We probably may improve change it, but I'm not sure that it worth doing. Still, maybe leave some comment in API documentation. /* Fields protected by lock in BlockCopyState */ bool error_is_read; int ret; @@ -650,7 +650,8 @@ block_copy_dirty_clusters(BlockCopyCallState *call_state) assert(QEMU_IS_ALIGNED(offset, s->cluster_size)); assert(QEMU_IS_ALIGNED(bytes, s->cluster_size)); - while (bytes && aio_task_pool_status(aio) == 0 && !call_state->cancelled) { + while (bytes && aio_task_pool_status(aio) == 0 && + !qatomic_read(&call_state->cancelled)) { BlockCopyTask *task; int64_t status_bytes; @@ -761,7 +762,7 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state) do { ret = block_copy_dirty_clusters(call_state); - if (ret == 0 && !call_state->cancelled) { + if (ret == 0 && !qatomic_read(&call_state->cancelled)) { WITH_QEMU_LOCK_GUARD(&s->lock) { /* * Check that there is no task we still need to @@ -792,9 +793,9 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state) * 2. We have waited for some intersecting block-copy request * It may have failed and produced new dirty bits. */ - } while (ret > 0 && !call_state->cancelled); + } while (ret > 0 && !qatomic_read(&call_state->cancelled)); - call_state->finished = true; + qatomic_store_release(&call_state->finished, true); so, all writes to ret and error_is_read are finished to this point. if (call_state->cb) { call_state->cb(call_state->cb_opaque); @@ -857,35 +858,37 @@ void block_copy_call_free(BlockCopyCallState *call_state) return; } - assert(call_state->finished); + assert(qatomic_load_acquire(&call_state->finished)); Here we don't need load_aquire, as we don't read other fields. qatomic_read is enough. So what you say makes sense, the only thing that I wonder is: wouldn't it be better to have the acquire without assertion (or assert afterwards), just to be sure that we delete when finished is true? [...] } bool block_copy_call_cancelled(BlockCopyCallState *call_state) { - return call_state->cancelled; + return qatomic_read(&call_state->cancelled); } int block_copy_call_status(BlockCopyCallState *call_state, bool *error_is_read) { - assert(call_state->finished); + assert(qatomic_load_acquire(&call_state->finished)); Hmm. Here qatomic_load_acquire protects nothing (assertion will crash if not yet finished anyway). So, caller is double sure that block-copy is finished. Also it's misleading: if we think that it do some protection, we are doing wrong thing: assertions may be simply compiled out, we can't rely on statements inside assert() to be executed. So, let's use simple qatomic_read here too. Same applies here. if (error_is_read) {
Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState
14.06.2021 10:33, Emanuele Giuseppe Esposito wrote: By adding acquire/release pairs, we ensure that .ret and .error_is_read fields are written by block_copy_dirty_clusters before .finished is true. And that they are read by API user after .finished is true. The atomic here are necessary because the fields are concurrently modified also outside coroutines. To be honest, finished is modified only in coroutine. And read outside. Signed-off-by: Emanuele Giuseppe Esposito --- block/block-copy.c | 33 ++--- 1 file changed, 18 insertions(+), 15 deletions(-) diff --git a/block/block-copy.c b/block/block-copy.c index 6416929abd..5348e1f61b 100644 --- a/block/block-copy.c +++ b/block/block-copy.c @@ -53,14 +53,14 @@ typedef struct BlockCopyCallState { Coroutine *co; /* State */ -bool finished; +bool finished; /* atomic */ So, logic around finished: Thread of block_copy does: 0. finished is false 1. tasks set ret and error_is_read 2. qatomic_store_release finished -> true 3. after that point ret and error_is_read are not modified Other threads can: - qatomic_read finished, just to check are we finished or not - if finished, can read ret and error_is_read safely. If you not sure that block-copy finished, use qatomic_load_acquire() of finished first, to be sure that you read ret and error_is_read AFTER finished read and checked to be true. QemuCoSleep sleep; /* TODO: protect API with a lock */ /* To reference all call states from BlockCopyState */ QLIST_ENTRY(BlockCopyCallState) list; /* OUT parameters */ -bool cancelled; +bool cancelled; /* atomic */ Logic around cancelled is simpler: - false at start - qatomic_read is allowed from any thread - qatomic_write to true is allowed from any thread - never write to false Note that cancelling and finishing are racy. User can cancel block-copy that's already finished. We probably may improve change it, but I'm not sure that it worth doing. Still, maybe leave some comment in API documentation. /* Fields protected by lock in BlockCopyState */ bool error_is_read; int ret; @@ -650,7 +650,8 @@ block_copy_dirty_clusters(BlockCopyCallState *call_state) assert(QEMU_IS_ALIGNED(offset, s->cluster_size)); assert(QEMU_IS_ALIGNED(bytes, s->cluster_size)); -while (bytes && aio_task_pool_status(aio) == 0 && !call_state->cancelled) { +while (bytes && aio_task_pool_status(aio) == 0 && + !qatomic_read(&call_state->cancelled)) { BlockCopyTask *task; int64_t status_bytes; @@ -761,7 +762,7 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state) do { ret = block_copy_dirty_clusters(call_state); -if (ret == 0 && !call_state->cancelled) { +if (ret == 0 && !qatomic_read(&call_state->cancelled)) { WITH_QEMU_LOCK_GUARD(&s->lock) { /* * Check that there is no task we still need to @@ -792,9 +793,9 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state) * 2. We have waited for some intersecting block-copy request *It may have failed and produced new dirty bits. */ -} while (ret > 0 && !call_state->cancelled); +} while (ret > 0 && !qatomic_read(&call_state->cancelled)); -call_state->finished = true; +qatomic_store_release(&call_state->finished, true); so, all writes to ret and error_is_read are finished to this point. if (call_state->cb) { call_state->cb(call_state->cb_opaque); @@ -857,35 +858,37 @@ void block_copy_call_free(BlockCopyCallState *call_state) return; } -assert(call_state->finished); +assert(qatomic_load_acquire(&call_state->finished)); Here we don't need load_aquire, as we don't read other fields. qatomic_read is enough. g_free(call_state); } bool block_copy_call_finished(BlockCopyCallState *call_state) { -return call_state->finished; +return qatomic_load_acquire(&call_state->finished); and here } bool block_copy_call_succeeded(BlockCopyCallState *call_state) { -return call_state->finished && !call_state->cancelled && -call_state->ret == 0; +return qatomic_load_acquire(&call_state->finished) && + !qatomic_read(&call_state->cancelled) && + call_state->ret == 0; Here qatomic_load_acquire() is reasonable: it protects read of ->ret } bool block_copy_call_failed(BlockCopyCallState *call_state) { -return call_state->finished && !call_state->cancelled && -call_state->ret < 0; +return qatomic_load_acquire(&call_state->finished) && + !qatomic_read(&call_state->cancelled) && + call_state->ret < 0; Here reasonable. } bool block_copy_call_cancelled(BlockCopyCallState *call_state) { -return call_state->cancelled; +return