On Fri, Dec 02, 2016 at 02:04:00PM +0000, 'Viktor Bachraty' via ganeti-devel wrote:
> In case of Xen migrations, the most common failure case is when the
> instance fails to freeze so the migration fails with domains running on
> both target and source node. This patch allows migrate --cleanup to
> recover by running AbortMigrate() in case the instance is running on
> both the source and target node.
> 
> Signed-off-by: Viktor Bachraty <[email protected]>

Mostly LGTM. See below:

Thanks,
Brian.

> ---
>  lib/cmdlib/instance_migration.py | 99 ++++++++++++++++++++++++++++++----------
>  1 file changed, 74 insertions(+), 25 deletions(-)
> 
> +      result.Raise("Can't contact node %s" % self.cfg.GetNodeName(node_uuid))
> +
> +    # Xen renames the instance during migration, unfortunately we don't have
> +    # a nicer way of identifying that it's the same instance. This is an awful
> +    # leaking abstraction.

Could we add a little more documentation than this to make life easier for
future Ganeti devs? E.g.:

# xm and xl have different (undocumented) naming conventions
# xm: (in tools/python/xen/xend/XendCheckpoint.py save() & restore())
#                   source dom name    target dom name
# during copy:      migrating-$DOM     $DOM
# finalize migrate: <none>             $DOM
# finished:         <none>             $DOM
#
# xl: (in tools/libxl/xl_cmdimpl.c migrate_domain() & migrate_receive())
#                   source dom name    target dom name
# during copy:      $DOM               $DOM--incoming
# finalize migrate: $DOM--migratedaway $DOM
# finished:         <none>             $DOM

> +    variants = [
> +        name, 'migrating-' + name, name + '--incoming', name + '--migratedaway']
> +    node_uuids = [node for node, data in instance_list.items()
> +                  if any(var in data.payload for var in variants)]
> +    self.feedback_fn("* instance running on: %s" % ','.join(
> +        self.cfg.GetNodeName(uuid) for uuid in node_uuids))
> +    return node_uuids
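
For illustration only, the naming table could live next to the list of
variants in a small helper, so the comment and the code cannot drift apart.
This is a rough sketch, not the patch's code: the function names are made up
here, and a plain dict of node name to domain names stands in for the RPC
results (an assumption about their shape).

```python
def xen_migration_name_variants(name):
    """Return the domain names an instance may carry during a Xen migration.

    Hypothetical helper; the variants mirror the xm/xl naming table.
    """
    return [
        name,                     # regular name (xm target, xl source during copy)
        "migrating-" + name,      # xm: source domain during copy
        name + "--incoming",      # xl: target domain during copy
        name + "--migratedaway",  # xl: source domain while finalizing
    ]


def nodes_running_instance(domains_by_node, name):
    """Return the nodes on which any name variant of the instance is listed.

    domains_by_node is assumed to map a node name to a list of domain names.
    """
    variants = xen_migration_name_variants(name)
    return [node for node, domains in domains_by_node.items()
            if any(var in domains for var in variants)]


if __name__ == "__main__":
    # Made-up example: web1 is mid-migration under xl, so it shows up twice.
    example = {
        "node1": ["web1--migratedaway", "db1"],
        "node2": ["web1"],
        "node3": ["db2"],
    }
    print(nodes_running_instance(example, "web1"))
```

With something like this, the cleanup path would only need to check whether
more than one node comes back for the instance.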
