Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState

2021-06-23 Thread Paolo Bonzini

On 22/06/21 12:39, Vladimir Sementsov-Ogievskiy wrote:

22.06.2021 13:20, Paolo Bonzini wrote:

On 22/06/21 11:36, Vladimir Sementsov-Ogievskiy wrote:

OK, I agree, let's keep it.


You can also have a finished job, but get a stale value for 
error_is_read or ret.  The issue is not in getting the stale value per 
se, but in block_copy_call_status's caller not expecting it.


Hmm. So, do you mean that we can read ret and error_is_read ONLY after 
explicitly doing load_acquire(finished) and checking that it's true?


That means that we must do it not in assertion (to not be compiled out):

bool finished = load_acquire()

assert(finished);

... read reat and error_is_read ...


In reality you must have synchronized in some other way; that outside 
synchronization outside block-copy.c is what guarantees that the 
assertion does not fail.  The simplest cases are:


- a mutex: "unlock" counts as release, "lock" counts as acquire;

- a bottom half: "schedule" counts as release, running counts as acquire.

Therefore, removing the assertion would work just fine because the 
synchronization would be like this:


   write ret/error_is_read
   write finished
   trigger bottom half or something -->bottom half runs
   read ret/error_is_read

So there is no need at all to read ->finished, much less to load it 
outside the assertion.  At the same time there are two problems with 
"assert(qatomic_read(&call_state->finished))".  Note that they are not 
logic issues, they are maintenance issues.


First, if *there is a bug elsewhere* and the above synchronization 
doesn't happen, it may manifest sometimes as an assertion failure and 
sometimes as a memory reordering.  This is what I was talking about in 
the previous message, and it is probably the last thing that you want 
when debugging; since we're adding asserts defensively, we might as well 
do it well.


Second, if somebody later carelessly changes the function to

  if (qatomic_read(&call_state->finished)) {
  ...
  } else {
  error_setg(...);
  }

that would be broken.  Using qatomic_load_acquire makes the code more 
future-proof against a change like the one above.


Paolo




Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState

2021-06-22 Thread Emanuele Giuseppe Esposito




On 22/06/2021 12:39, Vladimir Sementsov-Ogievskiy wrote:

22.06.2021 13:20, Paolo Bonzini wrote:

On 22/06/21 11:36, Vladimir Sementsov-Ogievskiy wrote:
It does.  If it returns true, you still want the load of finished to 
happen before the reads that follow.


Hmm.. The worst case if we use just qatomic_read is that assertion 
will not crash when it actually should. That doesn't break the logic. 
But that's not good anyway.


OK, I agree, let's keep it.


You can also have a finished job, but get a stale value for 
error_is_read or ret.  The issue is not in getting the stale value per 
se, but in block_copy_call_status's caller not expecting it.


(I understand you agree, but I guess it can be interesting to learn 
about this too).




Hmm. So, do you mean that we can read ret and error_is_read ONLY after 
explicitly doing load_acquire(finished) and checking that it's true?


That means that we must do it not in assertion (to not be compiled out):

bool finished = load_acquire()

assert(finished);

... read reat and error_is_read ...




If I understand correctly, this was what I was trying to say before: 
maybe it's better that we make sure that @finished is set before reading 
@ret and @error_is_read. And because assert can be disabled, we can do 
like you wrote above.


Anyways, let's wait Paolo's answer for this. Once this is ready, I will 
send v5.


Emanuele




Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState

2021-06-22 Thread Vladimir Sementsov-Ogievskiy

22.06.2021 13:20, Paolo Bonzini wrote:

On 22/06/21 11:36, Vladimir Sementsov-Ogievskiy wrote:

It does.  If it returns true, you still want the load of finished to happen 
before the reads that follow.


Hmm.. The worst case if we use just qatomic_read is that assertion will not 
crash when it actually should. That doesn't break the logic. But that's not 
good anyway.

OK, I agree, let's keep it.


You can also have a finished job, but get a stale value for error_is_read or 
ret.  The issue is not in getting the stale value per se, but in 
block_copy_call_status's caller not expecting it.

(I understand you agree, but I guess it can be interesting to learn about this 
too).



Hmm. So, do you mean that we can read ret and error_is_read ONLY after 
explicitly doing load_acquire(finished) and checking that it's true?

That means that we must do it not in assertion (to not be compiled out):

bool finished = load_acquire()

assert(finished);

... read reat and error_is_read ...


--
Best regards,
Vladimir



Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState

2021-06-22 Thread Paolo Bonzini

On 22/06/21 11:36, Vladimir Sementsov-Ogievskiy wrote:
It does.  If it returns true, you still want the load of finished to 
happen before the reads that follow.


Hmm.. The worst case if we use just qatomic_read is that assertion will 
not crash when it actually should. That doesn't break the logic. But 
that's not good anyway.


OK, I agree, let's keep it.


You can also have a finished job, but get a stale value for 
error_is_read or ret.  The issue is not in getting the stale value per 
se, but in block_copy_call_status's caller not expecting it.


(I understand you agree, but I guess it can be interesting to learn 
about this too).


Paolo




Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState

2021-06-22 Thread Vladimir Sementsov-Ogievskiy

21.06.2021 12:30, Emanuele Giuseppe Esposito wrote:



On 19/06/2021 22:06, Vladimir Sementsov-Ogievskiy wrote:

14.06.2021 10:33, Emanuele Giuseppe Esposito wrote:

By adding acquire/release pairs, we ensure that .ret and .error_is_read
fields are written by block_copy_dirty_clusters before .finished is true.


And that they are read by API user after .finished is true.



The atomic here are necessary because the fields are concurrently modified
also outside coroutines.


To be honest, finished is modified only in coroutine. And read outside.



Signed-off-by: Emanuele Giuseppe Esposito 
---
  block/block-copy.c | 33 ++---
  1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 6416929abd..5348e1f61b 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -53,14 +53,14 @@ typedef struct BlockCopyCallState {
  Coroutine *co;
  /* State */
-    bool finished;
+    bool finished; /* atomic */


So, logic around finished:

Thread of block_copy does:
0. finished is false
1. tasks set ret and error_is_read
2. qatomic_store_release finished -> true
3. after that point ret and error_is_read are not modified

Other threads can:

- qatomic_read finished, just to check are we finished or not

- if finished, can read ret and error_is_read safely. If you not sure that 
block-copy finished, use qatomic_load_acquire() of finished first, to be sure 
that you read ret and error_is_read AFTER finished read and checked to be true.


  QemuCoSleep sleep; /* TODO: protect API with a lock */
  /* To reference all call states from BlockCopyState */
  QLIST_ENTRY(BlockCopyCallState) list;
  /* OUT parameters */
-    bool cancelled;
+    bool cancelled; /* atomic */


Logic around cancelled is simpler:

- false at start

- qatomic_read is allowed from any thread

- qatomic_write to true is allowed from any thread

- never write to false

Note that cancelling and finishing are racy. User can cancel block-copy that's 
already finished. We probably may improve change it, but I'm not sure that it 
worth doing. Still, maybe leave some comment in API documentation.


  /* Fields protected by lock in BlockCopyState */
  bool error_is_read;
  int ret;
@@ -650,7 +650,8 @@ block_copy_dirty_clusters(BlockCopyCallState *call_state)
  assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
  assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
-    while (bytes && aio_task_pool_status(aio) == 0 && !call_state->cancelled) {
+    while (bytes && aio_task_pool_status(aio) == 0 &&
+   !qatomic_read(&call_state->cancelled)) {
  BlockCopyTask *task;
  int64_t status_bytes;
@@ -761,7 +762,7 @@ static int coroutine_fn 
block_copy_common(BlockCopyCallState *call_state)
  do {
  ret = block_copy_dirty_clusters(call_state);
-    if (ret == 0 && !call_state->cancelled) {
+    if (ret == 0 && !qatomic_read(&call_state->cancelled)) {
  WITH_QEMU_LOCK_GUARD(&s->lock) {
  /*
   * Check that there is no task we still need to
@@ -792,9 +793,9 @@ static int coroutine_fn 
block_copy_common(BlockCopyCallState *call_state)
   * 2. We have waited for some intersecting block-copy request
   *    It may have failed and produced new dirty bits.
   */
-    } while (ret > 0 && !call_state->cancelled);
+    } while (ret > 0 && !qatomic_read(&call_state->cancelled));
-    call_state->finished = true;
+    qatomic_store_release(&call_state->finished, true);


so, all writes to ret and error_is_read are finished to this point.


  if (call_state->cb) {
  call_state->cb(call_state->cb_opaque);
@@ -857,35 +858,37 @@ void block_copy_call_free(BlockCopyCallState *call_state)
  return;
  }
-    assert(call_state->finished);
+    assert(qatomic_load_acquire(&call_state->finished));


Here we don't need load_aquire, as we don't read other fields. qatomic_read is 
enough.


So what you say makes sense, the only thing that I wonder is: wouldn't it be 
better to have the acquire without assertion (or assert afterwards), just to be 
sure that we delete when finished is true?



Hmm. I think neither compiler nor processor should reorder read structure field 
and free() call on the whole structure :)

And anyway for block_copy_call_free() caller is responsible for the structure 
not being used by other thread.






  }
  bool block_copy_call_cancelled(BlockCopyCallState *call_state)
  {
-    return call_state->cancelled;
+    return qatomic_read(&call_state->cancelled);
  }
  int block_copy_call_status(BlockCopyCallState *call_state, bool 
*error_is_read)
  {
-    assert(call_state->finished);
+    assert(qatomic_load_acquire(&call_state->finished));


Hmm. Here qatomic_load_acquire protects nothing (assertion will crash if not 
yet finished anyway). So, caller is double sure that block-copy is finished.

Also it's misleading: if 

Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState

2021-06-22 Thread Vladimir Sementsov-Ogievskiy

22.06.2021 11:15, Paolo Bonzini wrote:

On 19/06/21 22:06, Vladimir Sementsov-Ogievskiy wrote:


-    assert(call_state->finished);
+    assert(qatomic_load_acquire(&call_state->finished));


Hmm. Here qatomic_load_acquire protects nothing (assertion will crash if not 
yet finished anyway). So, caller is double sure that block-copy is finished.


It does.  If it returns true, you still want the load of finished to happen 
before the reads that follow.



Hmm.. The worst case if we use just qatomic_read is that assertion will not 
crash when it actually should. That doesn't break the logic. But that's not 
good anyway.

OK, I agree, let's keep it.


Otherwise I agree with your remarks.

Paolo


Also it's misleading: if we think that it do some protection, we are doing 
wrong thing: assertions may be simply compiled out, we can't rely on statements 
inside assert() to be executed.

So, let's use simple qatomic_read here too.


  if (error_is_read) {
  *error_is_read = call_state->error_is_read;
  } 





--
Best regards,
Vladimir



Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState

2021-06-22 Thread Paolo Bonzini

On 19/06/21 22:06, Vladimir Sementsov-Ogievskiy wrote:


-    assert(call_state->finished);
+    assert(qatomic_load_acquire(&call_state->finished));


Hmm. Here qatomic_load_acquire protects nothing (assertion will crash if 
not yet finished anyway). So, caller is double sure that block-copy is 
finished.


It does.  If it returns true, you still want the load of finished to 
happen before the reads that follow.


Otherwise I agree with your remarks.

Paolo

Also it's misleading: if we think that it do some protection, we are 
doing wrong thing: assertions may be simply compiled out, we can't rely 
on statements inside assert() to be executed.


So, let's use simple qatomic_read here too.


  if (error_is_read) {
  *error_is_read = call_state->error_is_read;
  } 





Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState

2021-06-21 Thread Emanuele Giuseppe Esposito




On 19/06/2021 22:06, Vladimir Sementsov-Ogievskiy wrote:

14.06.2021 10:33, Emanuele Giuseppe Esposito wrote:

By adding acquire/release pairs, we ensure that .ret and .error_is_read
fields are written by block_copy_dirty_clusters before .finished is true.


And that they are read by API user after .finished is true.



The atomic here are necessary because the fields are concurrently 
modified

also outside coroutines.


To be honest, finished is modified only in coroutine. And read outside.



Signed-off-by: Emanuele Giuseppe Esposito 
---
  block/block-copy.c | 33 ++---
  1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 6416929abd..5348e1f61b 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -53,14 +53,14 @@ typedef struct BlockCopyCallState {
  Coroutine *co;
  /* State */
-    bool finished;
+    bool finished; /* atomic */


So, logic around finished:

Thread of block_copy does:
0. finished is false
1. tasks set ret and error_is_read
2. qatomic_store_release finished -> true
3. after that point ret and error_is_read are not modified

Other threads can:

- qatomic_read finished, just to check are we finished or not

- if finished, can read ret and error_is_read safely. If you not sure 
that block-copy finished, use qatomic_load_acquire() of finished first, 
to be sure that you read ret and error_is_read AFTER finished read and 
checked to be true.



  QemuCoSleep sleep; /* TODO: protect API with a lock */
  /* To reference all call states from BlockCopyState */
  QLIST_ENTRY(BlockCopyCallState) list;
  /* OUT parameters */
-    bool cancelled;
+    bool cancelled; /* atomic */


Logic around cancelled is simpler:

- false at start

- qatomic_read is allowed from any thread

- qatomic_write to true is allowed from any thread

- never write to false

Note that cancelling and finishing are racy. User can cancel block-copy 
that's already finished. We probably may improve change it, but I'm not 
sure that it worth doing. Still, maybe leave some comment in API 
documentation.



  /* Fields protected by lock in BlockCopyState */
  bool error_is_read;
  int ret;
@@ -650,7 +650,8 @@ block_copy_dirty_clusters(BlockCopyCallState 
*call_state)

  assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
  assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
-    while (bytes && aio_task_pool_status(aio) == 0 && 
!call_state->cancelled) {

+    while (bytes && aio_task_pool_status(aio) == 0 &&
+   !qatomic_read(&call_state->cancelled)) {
  BlockCopyTask *task;
  int64_t status_bytes;
@@ -761,7 +762,7 @@ static int coroutine_fn 
block_copy_common(BlockCopyCallState *call_state)

  do {
  ret = block_copy_dirty_clusters(call_state);
-    if (ret == 0 && !call_state->cancelled) {
+    if (ret == 0 && !qatomic_read(&call_state->cancelled)) {
  WITH_QEMU_LOCK_GUARD(&s->lock) {
  /*
   * Check that there is no task we still need to
@@ -792,9 +793,9 @@ static int coroutine_fn 
block_copy_common(BlockCopyCallState *call_state)

   * 2. We have waited for some intersecting block-copy request
   *    It may have failed and produced new dirty bits.
   */
-    } while (ret > 0 && !call_state->cancelled);
+    } while (ret > 0 && !qatomic_read(&call_state->cancelled));
-    call_state->finished = true;
+    qatomic_store_release(&call_state->finished, true);


so, all writes to ret and error_is_read are finished to this point.


  if (call_state->cb) {
  call_state->cb(call_state->cb_opaque);
@@ -857,35 +858,37 @@ void block_copy_call_free(BlockCopyCallState 
*call_state)

  return;
  }
-    assert(call_state->finished);
+    assert(qatomic_load_acquire(&call_state->finished));


Here we don't need load_aquire, as we don't read other fields. 
qatomic_read is enough.


So what you say makes sense, the only thing that I wonder is: wouldn't 
it be better to have the acquire without assertion (or assert 
afterwards), just to be sure that we delete when finished is true?


[...]




  }
  bool block_copy_call_cancelled(BlockCopyCallState *call_state)
  {
-    return call_state->cancelled;
+    return qatomic_read(&call_state->cancelled);
  }
  int block_copy_call_status(BlockCopyCallState *call_state, bool 
*error_is_read)

  {
-    assert(call_state->finished);
+    assert(qatomic_load_acquire(&call_state->finished));


Hmm. Here qatomic_load_acquire protects nothing (assertion will crash if 
not yet finished anyway). So, caller is double sure that block-copy is 
finished.


Also it's misleading: if we think that it do some protection, we are 
doing wrong thing: assertions may be simply compiled out, we can't rely 
on statements inside assert() to be executed.


So, let's use simple qatomic_read here too.


Same applies here.




  if (error_is_read) {
 

Re: [PATCH v4 6/6] block-copy: atomic .cancelled and .finished fields in BlockCopyCallState

2021-06-19 Thread Vladimir Sementsov-Ogievskiy

14.06.2021 10:33, Emanuele Giuseppe Esposito wrote:

By adding acquire/release pairs, we ensure that .ret and .error_is_read
fields are written by block_copy_dirty_clusters before .finished is true.


And that they are read by API user after .finished is true.



The atomic here are necessary because the fields are concurrently modified
also outside coroutines.


To be honest, finished is modified only in coroutine. And read outside.



Signed-off-by: Emanuele Giuseppe Esposito 
---
  block/block-copy.c | 33 ++---
  1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 6416929abd..5348e1f61b 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -53,14 +53,14 @@ typedef struct BlockCopyCallState {
  Coroutine *co;
  
  /* State */

-bool finished;
+bool finished; /* atomic */


So, logic around finished:

Thread of block_copy does:
0. finished is false
1. tasks set ret and error_is_read
2. qatomic_store_release finished -> true
3. after that point ret and error_is_read are not modified

Other threads can:

- qatomic_read finished, just to check are we finished or not

- if finished, can read ret and error_is_read safely. If you not sure that 
block-copy finished, use qatomic_load_acquire() of finished first, to be sure 
that you read ret and error_is_read AFTER finished read and checked to be true.


  QemuCoSleep sleep; /* TODO: protect API with a lock */
  
  /* To reference all call states from BlockCopyState */

  QLIST_ENTRY(BlockCopyCallState) list;
  
  /* OUT parameters */

-bool cancelled;
+bool cancelled; /* atomic */


Logic around cancelled is simpler:

- false at start

- qatomic_read is allowed from any thread

- qatomic_write to true is allowed from any thread

- never write to false

Note that cancelling and finishing are racy. User can cancel block-copy that's 
already finished. We probably may improve change it, but I'm not sure that it 
worth doing. Still, maybe leave some comment in API documentation.


  /* Fields protected by lock in BlockCopyState */
  bool error_is_read;
  int ret;
@@ -650,7 +650,8 @@ block_copy_dirty_clusters(BlockCopyCallState *call_state)
  assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
  assert(QEMU_IS_ALIGNED(bytes, s->cluster_size));
  
-while (bytes && aio_task_pool_status(aio) == 0 && !call_state->cancelled) {

+while (bytes && aio_task_pool_status(aio) == 0 &&
+   !qatomic_read(&call_state->cancelled)) {
  BlockCopyTask *task;
  int64_t status_bytes;
  
@@ -761,7 +762,7 @@ static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)

  do {
  ret = block_copy_dirty_clusters(call_state);
  
-if (ret == 0 && !call_state->cancelled) {

+if (ret == 0 && !qatomic_read(&call_state->cancelled)) {
  WITH_QEMU_LOCK_GUARD(&s->lock) {
  /*
   * Check that there is no task we still need to
@@ -792,9 +793,9 @@ static int coroutine_fn 
block_copy_common(BlockCopyCallState *call_state)
   * 2. We have waited for some intersecting block-copy request
   *It may have failed and produced new dirty bits.
   */
-} while (ret > 0 && !call_state->cancelled);
+} while (ret > 0 && !qatomic_read(&call_state->cancelled));
  
-call_state->finished = true;

+qatomic_store_release(&call_state->finished, true);


so, all writes to ret and error_is_read are finished to this point.

  
  if (call_state->cb) {

  call_state->cb(call_state->cb_opaque);
@@ -857,35 +858,37 @@ void block_copy_call_free(BlockCopyCallState *call_state)
  return;
  }
  
-assert(call_state->finished);

+assert(qatomic_load_acquire(&call_state->finished));


Here we don't need load_aquire, as we don't read other fields. qatomic_read is 
enough.


  g_free(call_state);
  }
  
  bool block_copy_call_finished(BlockCopyCallState *call_state)

  {
-return call_state->finished;
+return qatomic_load_acquire(&call_state->finished);


and here


  }
  
  bool block_copy_call_succeeded(BlockCopyCallState *call_state)

  {
-return call_state->finished && !call_state->cancelled &&
-call_state->ret == 0;
+return qatomic_load_acquire(&call_state->finished) &&
+   !qatomic_read(&call_state->cancelled) &&
+   call_state->ret == 0;


Here qatomic_load_acquire() is reasonable: it protects read of ->ret


  }
  
  bool block_copy_call_failed(BlockCopyCallState *call_state)

  {
-return call_state->finished && !call_state->cancelled &&
-call_state->ret < 0;
+return qatomic_load_acquire(&call_state->finished) &&
+   !qatomic_read(&call_state->cancelled) &&
+   call_state->ret < 0;


Here reasonable.


  }
  
  bool block_copy_call_cancelled(BlockCopyCallState *call_state)

  {
-return call_state->cancelled;
+return