Re: [Qemu-devel] [PATCH for-4.0 v8 3/7] migration: fix the multifd code when receiving less channels

2018-12-19 Thread Fei Li




On 12/19/2018 10:11 PM, Markus Armbruster wrote:

Fei Li  writes:


On 12/13/2018 02:17 PM, Markus Armbruster wrote:

Fei Li  writes:


In our current code, when multifd is used during migration, if there
is an error before the destination receives all new channels, the
source keeps running, however the destination does not exit but keeps
waiting until the source is killed deliberately.

Fix this by dumping the specific error and let users decide whether
to quit from the destination side when failing to receive packet via
some channel.

Cc: Dr. David Alan Gilbert 
Signed-off-by: Fei Li 
Reviewed-by: Peter Xu 

[...]

diff --git a/migration/migration.h b/migration/migration.h
index e413d4d8b6..02b7304610 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -229,7 +229,7 @@ struct MigrationState
   void migrate_set_state(int *state, int old_state, int new_state);
 void migration_fd_process_incoming(QEMUFile *f);
-void migration_ioc_process_incoming(QIOChannel *ioc);
+void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp);
   void migration_incoming_process(void);
 bool  migration_has_all_channels(void);
diff --git a/migration/ram.c b/migration/ram.c
index 7e7deec4d8..c7e3d6b0fd 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1323,7 +1323,7 @@ bool multifd_recv_all_channels_created(void)
   }
 /* Return true if multifd is ready for the migration, otherwise
false */
-bool multifd_recv_new_channel(QIOChannel *ioc)
+bool multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
   {
   MultiFDRecvParams *p;
   Error *local_err = NULL;
@@ -1331,6 +1331,10 @@ bool multifd_recv_new_channel(QIOChannel *ioc)
 id = multifd_recv_initial_packet(ioc, _err);
   if (id < 0) {
+error_propagate_prepend(errp, local_err,
+"failed to receive packet"
+" via multifd channel %d: ",
+atomic_read(_recv_state->count));
   multifd_recv_terminate_threads(local_err);
   return false;

Here, we return false without setting an error.

I am not sure whether I understand correctly, but here I think the above
error_propagate_prepend() set the error to errp.

You're right, I got confused.

However, you shouldn't access @local_err after error_propagate() or
similar.  Please insert error_propagate_prepend() after
multifd_recv_terminate_threads(), lik you do in the next hunk.

Right, thanks for the reminder.

Have a nice day :)
Fei



   }
@@ -1340,6 +1344,7 @@ bool multifd_recv_new_channel(QIOChannel *ioc)
   error_setg(_err, "multifd: received id '%d' already setup'",
  id);
   multifd_recv_terminate_threads(local_err);
+error_propagate(errp, local_err);
   return false;

Here, we return false with setting an error.


   }
   p->c = ioc;
@@ -1351,7 +1356,8 @@ bool multifd_recv_new_channel(QIOChannel *ioc)
   qemu_thread_create(>thread, p->name, multifd_recv_thread, p,
  QEMU_THREAD_JOINABLE);
   atomic_inc(_recv_state->count);
-return multifd_recv_state->count == migrate_multifd_channels();
+return atomic_read(_recv_state->count) ==
+   migrate_multifd_channels();

Here, we return either true of false without setting an error.

yes.

Taken together, there are three cases:

1. Succeed and return true

Yes, when all multifd channels are correctly received.

2. Succeed and return false

Yes, when the current multifd channel is received correctly, but
have not received all the channels.

Aha.


3. Fail (set an error) and return false.

Yes. And with the propagated error, the code just returns and
report the error in migration_channel_process_incoming().

Assuming that's what we want: please update the function comment to
spell them out.

Ok, I will update the three cases in the comment to clarify in detail.

Have a nice day, thanks :)

You're welcome!







Re: [Qemu-devel] [PATCH for-4.0 v8 3/7] migration: fix the multifd code when receiving less channels

2018-12-19 Thread Markus Armbruster
Fei Li  writes:

> On 12/13/2018 02:17 PM, Markus Armbruster wrote:
>> Fei Li  writes:
>>
>>> In our current code, when multifd is used during migration, if there
>>> is an error before the destination receives all new channels, the
>>> source keeps running, however the destination does not exit but keeps
>>> waiting until the source is killed deliberately.
>>>
>>> Fix this by dumping the specific error and let users decide whether
>>> to quit from the destination side when failing to receive packet via
>>> some channel.
>>>
>>> Cc: Dr. David Alan Gilbert 
>>> Signed-off-by: Fei Li 
>>> Reviewed-by: Peter Xu 
[...]
>>> diff --git a/migration/migration.h b/migration/migration.h
>>> index e413d4d8b6..02b7304610 100644
>>> --- a/migration/migration.h
>>> +++ b/migration/migration.h
>>> @@ -229,7 +229,7 @@ struct MigrationState
>>>   void migrate_set_state(int *state, int old_state, int new_state);
>>> void migration_fd_process_incoming(QEMUFile *f);
>>> -void migration_ioc_process_incoming(QIOChannel *ioc);
>>> +void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp);
>>>   void migration_incoming_process(void);
>>> bool  migration_has_all_channels(void);
>>> diff --git a/migration/ram.c b/migration/ram.c
>>> index 7e7deec4d8..c7e3d6b0fd 100644
>>> --- a/migration/ram.c
>>> +++ b/migration/ram.c
>>> @@ -1323,7 +1323,7 @@ bool multifd_recv_all_channels_created(void)
>>>   }
>>> /* Return true if multifd is ready for the migration, otherwise
>>> false */
>>> -bool multifd_recv_new_channel(QIOChannel *ioc)
>>> +bool multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
>>>   {
>>>   MultiFDRecvParams *p;
>>>   Error *local_err = NULL;
>>> @@ -1331,6 +1331,10 @@ bool multifd_recv_new_channel(QIOChannel *ioc)
>>> id = multifd_recv_initial_packet(ioc, _err);
>>>   if (id < 0) {
>>> +error_propagate_prepend(errp, local_err,
>>> +"failed to receive packet"
>>> +" via multifd channel %d: ",
>>> +atomic_read(_recv_state->count));
>>>   multifd_recv_terminate_threads(local_err);
>>>   return false;
>> Here, we return false without setting an error.
> I am not sure whether I understand correctly, but here I think the above
> error_propagate_prepend() set the error to errp.

You're right, I got confused.

However, you shouldn't access @local_err after error_propagate() or
similar.  Please insert error_propagate_prepend() after
multifd_recv_terminate_threads(), lik you do in the next hunk.

>>>   }
>>> @@ -1340,6 +1344,7 @@ bool multifd_recv_new_channel(QIOChannel *ioc)
>>>   error_setg(_err, "multifd: received id '%d' already setup'",
>>>  id);
>>>   multifd_recv_terminate_threads(local_err);
>>> +error_propagate(errp, local_err);
>>>   return false;
>> Here, we return false with setting an error.
>>
>>>   }
>>>   p->c = ioc;
>>> @@ -1351,7 +1356,8 @@ bool multifd_recv_new_channel(QIOChannel *ioc)
>>>   qemu_thread_create(>thread, p->name, multifd_recv_thread, p,
>>>  QEMU_THREAD_JOINABLE);
>>>   atomic_inc(_recv_state->count);
>>> -return multifd_recv_state->count == migrate_multifd_channels();
>>> +return atomic_read(_recv_state->count) ==
>>> +   migrate_multifd_channels();
>> Here, we return either true of false without setting an error.
> yes.
>> Taken together, there are three cases:
>>
>> 1. Succeed and return true
> Yes, when all multifd channels are correctly received.
>> 2. Succeed and return false
> Yes, when the current multifd channel is received correctly, but
> have not received all the channels.

Aha.

>> 3. Fail (set an error) and return false.
> Yes. And with the propagated error, the code just returns and
> report the error in migration_channel_process_incoming().
>> Assuming that's what we want: please update the function comment to
>> spell them out.
> Ok, I will update the three cases in the comment to clarify in detail.
>
> Have a nice day, thanks :)

You're welcome!



Re: [Qemu-devel] [PATCH for-4.0 v8 3/7] migration: fix the multifd code when receiving less channels

2018-12-17 Thread Fei Li




On 12/13/2018 02:17 PM, Markus Armbruster wrote:

Fei Li  writes:


In our current code, when multifd is used during migration, if there
is an error before the destination receives all new channels, the
source keeps running, however the destination does not exit but keeps
waiting until the source is killed deliberately.

Fix this by dumping the specific error and let users decide whether
to quit from the destination side when failing to receive packet via
some channel.

Cc: Dr. David Alan Gilbert 
Signed-off-by: Fei Li 
Reviewed-by: Peter Xu 
---
  migration/channel.c   | 11 ++-
  migration/migration.c |  9 +++--
  migration/migration.h |  2 +-
  migration/ram.c   | 10 --
  migration/ram.h   |  2 +-
  5 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/migration/channel.c b/migration/channel.c
index 33e0e9b82f..20e4c8e2dc 100644
--- a/migration/channel.c
+++ b/migration/channel.c
@@ -30,6 +30,7 @@
  void migration_channel_process_incoming(QIOChannel *ioc)
  {
  MigrationState *s = migrate_get_current();
+Error *local_err = NULL;
  
  trace_migration_set_incoming_channel(

  ioc, object_get_typename(OBJECT(ioc)));
@@ -38,13 +39,13 @@ void migration_channel_process_incoming(QIOChannel *ioc)
  *s->parameters.tls_creds &&
  !object_dynamic_cast(OBJECT(ioc),
   TYPE_QIO_CHANNEL_TLS)) {
-Error *local_err = NULL;
  migration_tls_channel_process_incoming(s, ioc, _err);
-if (local_err) {
-error_report_err(local_err);
-}
  } else {
-migration_ioc_process_incoming(ioc);
+migration_ioc_process_incoming(ioc, _err);
+}
+
+if (local_err) {
+error_report_err(local_err);
  }
  }
  
diff --git a/migration/migration.c b/migration/migration.c

index 49ffb9997a..72106bddf0 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -541,7 +541,7 @@ void migration_fd_process_incoming(QEMUFile *f)
  migration_incoming_process();
  }
  
-void migration_ioc_process_incoming(QIOChannel *ioc)

+void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
  {
  MigrationIncomingState *mis = migration_incoming_get_current();
  bool start_migration;
@@ -563,9 +563,14 @@ void migration_ioc_process_incoming(QIOChannel *ioc)
   */
  start_migration = !migrate_use_multifd();
  } else {
+Error *local_err = NULL;
  /* Multiple connections */
  assert(migrate_use_multifd());
-start_migration = multifd_recv_new_channel(ioc);
+start_migration = multifd_recv_new_channel(ioc, _err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
  }
  
  if (start_migration) {

diff --git a/migration/migration.h b/migration/migration.h
index e413d4d8b6..02b7304610 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -229,7 +229,7 @@ struct MigrationState
  void migrate_set_state(int *state, int old_state, int new_state);
  
  void migration_fd_process_incoming(QEMUFile *f);

-void migration_ioc_process_incoming(QIOChannel *ioc);
+void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp);
  void migration_incoming_process(void);
  
  bool  migration_has_all_channels(void);

diff --git a/migration/ram.c b/migration/ram.c
index 7e7deec4d8..c7e3d6b0fd 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1323,7 +1323,7 @@ bool multifd_recv_all_channels_created(void)
  }
  
  /* Return true if multifd is ready for the migration, otherwise false */

-bool multifd_recv_new_channel(QIOChannel *ioc)
+bool multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
  {
  MultiFDRecvParams *p;
  Error *local_err = NULL;
@@ -1331,6 +1331,10 @@ bool multifd_recv_new_channel(QIOChannel *ioc)
  
  id = multifd_recv_initial_packet(ioc, _err);

  if (id < 0) {
+error_propagate_prepend(errp, local_err,
+"failed to receive packet"
+" via multifd channel %d: ",
+atomic_read(_recv_state->count));
  multifd_recv_terminate_threads(local_err);
  return false;

Here, we return false without setting an error.

I am not sure whether I understand correctly, but here I think the above
error_propagate_prepend() set the error to errp.

  }
@@ -1340,6 +1344,7 @@ bool multifd_recv_new_channel(QIOChannel *ioc)
  error_setg(_err, "multifd: received id '%d' already setup'",
 id);
  multifd_recv_terminate_threads(local_err);
+error_propagate(errp, local_err);
  return false;

Here, we return false with setting an error.


  }
  p->c = ioc;
@@ -1351,7 +1356,8 @@ bool multifd_recv_new_channel(QIOChannel *ioc)
  qemu_thread_create(>thread, p->name, multifd_recv_thread, p,
 QEMU_THREAD_JOINABLE);
  

Re: [Qemu-devel] [PATCH for-4.0 v8 3/7] migration: fix the multifd code when receiving less channels

2018-12-12 Thread Markus Armbruster
Fei Li  writes:

> In our current code, when multifd is used during migration, if there
> is an error before the destination receives all new channels, the
> source keeps running, however the destination does not exit but keeps
> waiting until the source is killed deliberately.
>
> Fix this by dumping the specific error and let users decide whether
> to quit from the destination side when failing to receive packet via
> some channel.
>
> Cc: Dr. David Alan Gilbert 
> Signed-off-by: Fei Li 
> Reviewed-by: Peter Xu 
> ---
>  migration/channel.c   | 11 ++-
>  migration/migration.c |  9 +++--
>  migration/migration.h |  2 +-
>  migration/ram.c   | 10 --
>  migration/ram.h   |  2 +-
>  5 files changed, 23 insertions(+), 11 deletions(-)
>
> diff --git a/migration/channel.c b/migration/channel.c
> index 33e0e9b82f..20e4c8e2dc 100644
> --- a/migration/channel.c
> +++ b/migration/channel.c
> @@ -30,6 +30,7 @@
>  void migration_channel_process_incoming(QIOChannel *ioc)
>  {
>  MigrationState *s = migrate_get_current();
> +Error *local_err = NULL;
>  
>  trace_migration_set_incoming_channel(
>  ioc, object_get_typename(OBJECT(ioc)));
> @@ -38,13 +39,13 @@ void migration_channel_process_incoming(QIOChannel *ioc)
>  *s->parameters.tls_creds &&
>  !object_dynamic_cast(OBJECT(ioc),
>   TYPE_QIO_CHANNEL_TLS)) {
> -Error *local_err = NULL;
>  migration_tls_channel_process_incoming(s, ioc, _err);
> -if (local_err) {
> -error_report_err(local_err);
> -}
>  } else {
> -migration_ioc_process_incoming(ioc);
> +migration_ioc_process_incoming(ioc, _err);
> +}
> +
> +if (local_err) {
> +error_report_err(local_err);
>  }
>  }
>  
> diff --git a/migration/migration.c b/migration/migration.c
> index 49ffb9997a..72106bddf0 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -541,7 +541,7 @@ void migration_fd_process_incoming(QEMUFile *f)
>  migration_incoming_process();
>  }
>  
> -void migration_ioc_process_incoming(QIOChannel *ioc)
> +void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
>  {
>  MigrationIncomingState *mis = migration_incoming_get_current();
>  bool start_migration;
> @@ -563,9 +563,14 @@ void migration_ioc_process_incoming(QIOChannel *ioc)
>   */
>  start_migration = !migrate_use_multifd();
>  } else {
> +Error *local_err = NULL;
>  /* Multiple connections */
>  assert(migrate_use_multifd());
> -start_migration = multifd_recv_new_channel(ioc);
> +start_migration = multifd_recv_new_channel(ioc, _err);
> +if (local_err) {
> +error_propagate(errp, local_err);
> +return;
> +}
>  }
>  
>  if (start_migration) {
> diff --git a/migration/migration.h b/migration/migration.h
> index e413d4d8b6..02b7304610 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -229,7 +229,7 @@ struct MigrationState
>  void migrate_set_state(int *state, int old_state, int new_state);
>  
>  void migration_fd_process_incoming(QEMUFile *f);
> -void migration_ioc_process_incoming(QIOChannel *ioc);
> +void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp);
>  void migration_incoming_process(void);
>  
>  bool  migration_has_all_channels(void);
> diff --git a/migration/ram.c b/migration/ram.c
> index 7e7deec4d8..c7e3d6b0fd 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -1323,7 +1323,7 @@ bool multifd_recv_all_channels_created(void)
>  }
>  
>  /* Return true if multifd is ready for the migration, otherwise false */
> -bool multifd_recv_new_channel(QIOChannel *ioc)
> +bool multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
>  {
>  MultiFDRecvParams *p;
>  Error *local_err = NULL;
> @@ -1331,6 +1331,10 @@ bool multifd_recv_new_channel(QIOChannel *ioc)
>  
>  id = multifd_recv_initial_packet(ioc, _err);
>  if (id < 0) {
> +error_propagate_prepend(errp, local_err,
> +"failed to receive packet"
> +" via multifd channel %d: ",
> +atomic_read(_recv_state->count));
>  multifd_recv_terminate_threads(local_err);
>  return false;

Here, we return false without setting an error.

>  }
> @@ -1340,6 +1344,7 @@ bool multifd_recv_new_channel(QIOChannel *ioc)
>  error_setg(_err, "multifd: received id '%d' already setup'",
> id);
>  multifd_recv_terminate_threads(local_err);
> +error_propagate(errp, local_err);
>  return false;

Here, we return false with setting an error.

>  }
>  p->c = ioc;
> @@ -1351,7 +1356,8 @@ bool multifd_recv_new_channel(QIOChannel *ioc)
>  qemu_thread_create(>thread, p->name, multifd_recv_thread, p,
> QEMU_THREAD_JOINABLE);
>  

[Qemu-devel] [PATCH for-4.0 v8 3/7] migration: fix the multifd code when receiving less channels

2018-12-11 Thread Fei Li
In our current code, when multifd is used during migration, if there
is an error before the destination receives all new channels, the
source keeps running, however the destination does not exit but keeps
waiting until the source is killed deliberately.

Fix this by dumping the specific error and let users decide whether
to quit from the destination side when failing to receive packet via
some channel.

Cc: Dr. David Alan Gilbert 
Signed-off-by: Fei Li 
Reviewed-by: Peter Xu 
---
 migration/channel.c   | 11 ++-
 migration/migration.c |  9 +++--
 migration/migration.h |  2 +-
 migration/ram.c   | 10 --
 migration/ram.h   |  2 +-
 5 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/migration/channel.c b/migration/channel.c
index 33e0e9b82f..20e4c8e2dc 100644
--- a/migration/channel.c
+++ b/migration/channel.c
@@ -30,6 +30,7 @@
 void migration_channel_process_incoming(QIOChannel *ioc)
 {
 MigrationState *s = migrate_get_current();
+Error *local_err = NULL;
 
 trace_migration_set_incoming_channel(
 ioc, object_get_typename(OBJECT(ioc)));
@@ -38,13 +39,13 @@ void migration_channel_process_incoming(QIOChannel *ioc)
 *s->parameters.tls_creds &&
 !object_dynamic_cast(OBJECT(ioc),
  TYPE_QIO_CHANNEL_TLS)) {
-Error *local_err = NULL;
 migration_tls_channel_process_incoming(s, ioc, _err);
-if (local_err) {
-error_report_err(local_err);
-}
 } else {
-migration_ioc_process_incoming(ioc);
+migration_ioc_process_incoming(ioc, _err);
+}
+
+if (local_err) {
+error_report_err(local_err);
 }
 }
 
diff --git a/migration/migration.c b/migration/migration.c
index 49ffb9997a..72106bddf0 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -541,7 +541,7 @@ void migration_fd_process_incoming(QEMUFile *f)
 migration_incoming_process();
 }
 
-void migration_ioc_process_incoming(QIOChannel *ioc)
+void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
 {
 MigrationIncomingState *mis = migration_incoming_get_current();
 bool start_migration;
@@ -563,9 +563,14 @@ void migration_ioc_process_incoming(QIOChannel *ioc)
  */
 start_migration = !migrate_use_multifd();
 } else {
+Error *local_err = NULL;
 /* Multiple connections */
 assert(migrate_use_multifd());
-start_migration = multifd_recv_new_channel(ioc);
+start_migration = multifd_recv_new_channel(ioc, _err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
 }
 
 if (start_migration) {
diff --git a/migration/migration.h b/migration/migration.h
index e413d4d8b6..02b7304610 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -229,7 +229,7 @@ struct MigrationState
 void migrate_set_state(int *state, int old_state, int new_state);
 
 void migration_fd_process_incoming(QEMUFile *f);
-void migration_ioc_process_incoming(QIOChannel *ioc);
+void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp);
 void migration_incoming_process(void);
 
 bool  migration_has_all_channels(void);
diff --git a/migration/ram.c b/migration/ram.c
index 7e7deec4d8..c7e3d6b0fd 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1323,7 +1323,7 @@ bool multifd_recv_all_channels_created(void)
 }
 
 /* Return true if multifd is ready for the migration, otherwise false */
-bool multifd_recv_new_channel(QIOChannel *ioc)
+bool multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
 {
 MultiFDRecvParams *p;
 Error *local_err = NULL;
@@ -1331,6 +1331,10 @@ bool multifd_recv_new_channel(QIOChannel *ioc)
 
 id = multifd_recv_initial_packet(ioc, _err);
 if (id < 0) {
+error_propagate_prepend(errp, local_err,
+"failed to receive packet"
+" via multifd channel %d: ",
+atomic_read(_recv_state->count));
 multifd_recv_terminate_threads(local_err);
 return false;
 }
@@ -1340,6 +1344,7 @@ bool multifd_recv_new_channel(QIOChannel *ioc)
 error_setg(_err, "multifd: received id '%d' already setup'",
id);
 multifd_recv_terminate_threads(local_err);
+error_propagate(errp, local_err);
 return false;
 }
 p->c = ioc;
@@ -1351,7 +1356,8 @@ bool multifd_recv_new_channel(QIOChannel *ioc)
 qemu_thread_create(>thread, p->name, multifd_recv_thread, p,
QEMU_THREAD_JOINABLE);
 atomic_inc(_recv_state->count);
-return multifd_recv_state->count == migrate_multifd_channels();
+return atomic_read(_recv_state->count) ==
+   migrate_multifd_channels();
 }
 
 /**
diff --git a/migration/ram.h b/migration/ram.h
index 83ff1bc11a..046d3074be 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -47,7 +47,7 @@ int multifd_save_cleanup(Error **errp);