Re: [ClusterLabs] Gracefully Failing Live Migrations

2024-02-01 Thread Ken Gaillot
On Thu, 2024-02-01 at 12:57 -0600, Billy Croan wrote:
> How do I figure out which of the three steps failed and why?

They're normal resource actions: migrate_to, migrate_from, and stop.
You can investigate them in the usual way (status, logs).
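
As a rough illustration of reading the status output, here is a minimal Python sketch that pulls the action name out of a failed-action entry like the one quoted in this thread. It assumes the operation key has the form <resource>_<action>_<interval> and that the line is laid out as in the quoted example; the status format can differ between Pacemaker versions. The name `parse_failed_action` and the regular expression are hypothetical helpers for this example, not part of Pacemaker.

```python
import re

# The three actions involved in a live migration, as named in this thread.
MIGRATION_ACTIONS = ("migrate_to", "migrate_from", "stop")

# Assumed layout, taken from the status output quoted in this thread:
#   * <resource>_<action>_<interval> on <node> '<rc text>' (<rc>): ...
FAILED_ACTION_RE = re.compile(
    r"\*\s+(?P<key>\S+)\s+on\s+(?P<node>\S+)\s+'(?P<rc_text>[^']*)'\s+\((?P<rc>\d+)\)"
)


def parse_failed_action(line):
    """Return a dict describing a failed action, or None if the line does not match."""
    match = FAILED_ACTION_RE.search(line)
    if match is None:
        return None
    key = match.group("key")
    # The interval is the last underscore-separated field of the key.
    body, _, interval = key.rpartition("_")
    # Resource IDs may contain underscores, so try the known action names as suffixes.
    for action in MIGRATION_ACTIONS:
        suffix = "_" + action
        if body.endswith(suffix):
            resource = body[: -len(suffix)]
            break
    else:
        # Not a migration-related action: fall back to splitting on the last underscore.
        resource, _, action = body.rpartition("_")
    return {
        "resource": resource,
        "action": action,
        "interval": interval,
        "node": match.group("node"),
        "rc": int(match.group("rc")),
    }


line = "* vm_myvm_migrate_to_0 on node1 'unknown error' (1): call=102,"
print(parse_failed_action(line))
```

For the entry quoted in this thread, the failed step is migrate_to on node1, so that node's logs around the reported last-rc-change time are the place to start looking.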

> 
> On Thu, Feb 1, 2024 at 11:15 AM Ken Gaillot wrote:
> > On Thu, 2024-02-01 at 10:20 -0600, Billy Croan wrote:
> > > Sometimes I've tried to move a resource from one node to another,
> > > and it migrates live without a problem.  Other times I get
> > > > Failed Resource Actions:
> > > > * vm_myvm_migrate_to_0 on node1 'unknown error' (1): call=102,
> > > > status=complete, exitreason='myvm: live migration to node2
> > > > failed: 1',
> > > > last-rc-change='Sat Jan 13 09:13:31 2024', queued=1ms,
> > > > exec=35874ms
> > > >
> > >
> > > And I find out the live part of the migration failed, when the vm
> > > reboots and an (albeit minor) outage occurs.
> > >
> > > Is there a way to configure pacemaker, so that if it is unable to
> > > migrate live it simply does not migrate at all?
> > >
> >
> > No. Pacemaker automatically replaces a required stop/start sequence
> > with live migration when possible. If there is a live migration
> > attempted, by definition the resource must move one way or another.
> > Also, live migration involves three steps, and if one of them fails,
> > the resource is in an unknown state, so it must be restarted anyway.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Gracefully Failing Live Migrations

2024-02-01 Thread Billy Croan
How do I figure out which of the three steps failed and why?

On Thu, Feb 1, 2024 at 11:15 AM Ken Gaillot wrote:

> On Thu, 2024-02-01 at 10:20 -0600, Billy Croan wrote:
> > Sometimes I've tried to move a resource from one node to another, and
> > it migrates live without a problem.  Other times I get
> > > Failed Resource Actions:
> > > * vm_myvm_migrate_to_0 on node1 'unknown error' (1): call=102,
> > > status=complete, exitreason='myvm: live migration to node2 failed:
> > > 1',
> > > last-rc-change='Sat Jan 13 09:13:31 2024', queued=1ms,
> > > exec=35874ms
> > >
> >
> > And I find out the live part of the migration failed, when the vm
> > reboots and an (albeit minor) outage occurs.
> >
> > Is there a way to configure pacemaker, so that if it is unable to
> > migrate live it simply does not migrate at all?
> >
>
> No. Pacemaker automatically replaces a required stop/start sequence
> with live migration when possible. If there is a live migration
> attempted, by definition the resource must move one way or another.
> Also, live migration involves three steps, and if one of them fails,
> the resource is in an unknown state, so it must be restarted anyway.
> --
> Ken Gaillot 


Re: [ClusterLabs] Gracefully Failing Live Migrations

2024-02-01 Thread Ken Gaillot
On Thu, 2024-02-01 at 10:20 -0600, Billy Croan wrote:
> Sometimes I've tried to move a resource from one node to another, and
> it migrates live without a problem.  Other times I get 
> > Failed Resource Actions:
> > * vm_myvm_migrate_to_0 on node1 'unknown error' (1): call=102,
> > status=complete, exitreason='myvm: live migration to node2 failed:
> > 1',
> > last-rc-change='Sat Jan 13 09:13:31 2024', queued=1ms,
> > exec=35874ms
> > 
> 
> And I find out the live part of the migration failed, when the vm
> reboots and an (albeit minor) outage occurs.
> 
> Is there a way to configure pacemaker, so that if it is unable to
> migrate live it simply does not migrate at all?
> 

No. Pacemaker automatically replaces a required stop/start sequence
with live migration when possible. If there is a live migration
attempted, by definition the resource must move one way or another.
Also, live migration involves three steps, and if one of them fails,
the resource is in an unknown state, so it must be restarted anyway.
-- 
Ken Gaillot 



[ClusterLabs] Gracefully Failing Live Migrations

2024-02-01 Thread Billy Croan
Sometimes I've tried to move a resource from one node to another, and it
migrates live without a problem.  Other times I get

> Failed Resource Actions:
> * vm_myvm_migrate_to_0 on node1 'unknown error' (1): call=102,
> status=complete, exitreason='myvm: live migration to node2 failed: 1',
> last-rc-change='Sat Jan 13 09:13:31 2024', queued=1ms, exec=35874ms
>

And I find out the live part of the migration failed, when the vm reboots
and an (albeit minor) outage occurs.

Is there a way to configure pacemaker, so that if it is unable to migrate
live it simply does not migrate at all?