Re: [ClusterLabs] manner in which cluster migrates VirtualDomain - ?

2023-04-19 Thread lejeczek via Users




On 19/04/2023 16:16, Ken Gaillot wrote:
> On Wed, 2023-04-19 at 08:00 +0200, lejeczek via Users wrote:
> > On 18/04/2023 21:02, Ken Gaillot wrote:
> > > On Tue, 2023-04-18 at 19:36 +0200, lejeczek via Users wrote:
> > > > On 18/04/2023 18:22, Ken Gaillot wrote:
> > > > > On Tue, 2023-04-18 at 14:58 +0200, lejeczek via Users wrote:
> > > > > > Hi guys.
> > > > > >
> > > > > > When it's done by the cluster itself, eg. a node goes
> > > > > > 'standby' - how do clusters migrate VirtualDomain resources?
> > > > > 1. Call resource agent migrate_to action on original node
> > > > > 2. Call resource agent migrate_from action on new node
> > > > > 3. Call resource agent stop action on original node
> > > > >
> > > > > > Do users have any control over it and if so then how?
> > > > > The allow-migrate resource meta-attribute (true/false)
> > > > >
> > > > > > I'd imagine there must be some docs - I failed to find
> > > > > It's sort of scattered throughout Pacemaker Explained -- the
> > > > > main one is:
> > > > >
> > > > > https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/html/advanced-options.html#migrating-resources
> > > > >
> > > > > > Especially in large deployments one obvious question would be -
> > > > > > I'm guessing as my setup is rather SOHO - can VMs migrate in
> > > > > > sequence or it is(always?) a kind of 'swarm' migration?
> > > > > The migration-limit cluster property specifies how many live
> > > > > migrations may be initiated at once (the default of -1 means
> > > > > unlimited).
> > > > But if this is cluster property - unless I got it wrong,
> > > > hopefully - then this govern any/all resources.
> > > > If so, can such a limit be rounded down to RA type or
> > > > perhaps group of resources?
> > > >
> > > > many thanks, L.
> > > No, it's global
> > To me it feels so intuitive, so natural & obvious that I
> > will ask - nobody yet suggested that such feature be
> > available to smaller divisions of cluster independently of
> > global rule?
> > In the vastness of resource types many are polar opposites
> > and to treat them all the same?
> > Would be great to have some way to tell cluster to run
> > different migration/relocation limits on for eg.
> > compute-heavy resources VS light-weight ones - where to
> > "file" such a enhancement suggestion, Bugzilla?
> >
> > many thanks, L.

> Looking at the code, I see it's a little different than I originally
> thought.
>
> First, I overlooked that it's correctly documented as a per-node limit
> rather than a cluster-wide limit.
>
> That highlights the complexity of allowing different values for
> different resources; if rscA has a migration limit of 2, and rscB has a
> migration limit of 5, do we allow up to 2 rscA migrations and 5 rscB
> migrations simultaneously, or do we weight them relative to each other
> so the total capacity is still constrained (for example limiting it to
> 1 rscA migration and 2 rscB migrations together)?

My first thoughts - I cannot comment on the code, only on what
an admin would care about - were to introduce, if it would not
require a total overhaul of the business logic, "migration
groups" (not another resource type) of which a resource could
be a member.
Or perhaps to marry 'migration-limit' to 'resource group', which
would then take priority over the global/node-wide rule.
One way or another, it should be simple for end-users: the
user/admin sets a limit N on how many resources in such a group
can be live-migrated at one time. Say, in a given group the
cluster may attempt to live-migrate only 2 resources
simultaneously; it then waits for the result, success or
failure, and only then proceeds to the next, and so on.
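
A rough Python sketch of the per-group limit proposed above. Nothing like this exists in Pacemaker according to this thread; the function `startable` and the `group_of` / `group_limit` mappings are hypothetical names made up for illustration, and resources whose group has no limit are assumed to be unrestricted.

```python
from collections import Counter
from typing import Dict, List


def startable(pending: List[str],
              group_of: Dict[str, str],
              group_limit: Dict[str, int],
              running: List[str]) -> List[str]:
    """Return the pending resources that may begin live migration now."""
    # Count migrations already in flight per (hypothetical) migration group.
    in_flight = Counter(group_of[rsc] for rsc in running)
    chosen = []
    for rsc in pending:
        group = group_of[rsc]
        # Assumption: a group without an entry in group_limit is unlimited.
        limit = group_limit.get(group, float("inf"))
        if in_flight[group] < limit:
            in_flight[group] += 1
            chosen.append(rsc)
    return chosen


if __name__ == "__main__":
    group_of = {"vm1": "heavy", "vm2": "heavy", "vm3": "heavy", "ip1": "light"}
    group_limit = {"heavy": 2}
    print(startable(["vm1", "vm2", "vm3", "ip1"], group_of, group_limit, []))
```

With a limit of 2 on the "heavy" group, the third VM is held back until one of the first two has finished, while the lightweight resource is not delayed.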




> We would almost need something like the node utilization feature, being
> able to define a node's total migration capacity and then how much of
> that capacity is taken up by the migration of a specific resource. That
> seems overcomplicated to me, especially since there aren't that many
> resource types that support live migration.

As for the types which do support live migration and are
compute-heavy, I really wonder how large consumers handle
VirtualDomain migration, as one good example.
Take a virtualization/cloud provider, where a chunky host node
might host hundreds of VMs. There, as anywhere else, any and
all timeouts must be some real, fixed number.
As of right now, how intuitive is what the cluster does when it
swarms those hundreds of VMs - say, equally - onto the
remaining available nodes?
Even with fast inter-node connectivity, many live migrations
will time out without migration-limit.
Is the cluster capable of some very clever heuristics, so that
humans could leave it to the machine to ensure that such a
mass migration does not fail simply because of an overall
bottleneck in the underlying infrastructure?
And could the cluster do that alone? Would the VirtualDomain
agent not have to gather comprehensive metric data on each VM
in the first place, to feed it to the cluster's internal logic?
I see something similar to what I described above as a
relatively effective and surely down-to-earth, practical aid
to alleviate cases such as VM "mass migration".




> Second, any actions on a Pacemaker Remote node count toward the
> throttling limit of its connection host, and aren't checked for
> migration-limit at all. That's an interesting design choice, and it's
> not clear what the

Re: [ClusterLabs] manner in which cluster migrates VirtualDomain - ?

2023-04-19 Thread Ken Gaillot
On Wed, 2023-04-19 at 08:00 +0200, lejeczek via Users wrote:
> 
> On 18/04/2023 21:02, Ken Gaillot wrote:
> > On Tue, 2023-04-18 at 19:36 +0200, lejeczek via Users wrote:
> > > On 18/04/2023 18:22, Ken Gaillot wrote:
> > > > On Tue, 2023-04-18 at 14:58 +0200, lejeczek via Users wrote:
> > > > > Hi guys.
> > > > > 
> > > > > When it's done by the cluster itself, eg. a node goes
> > > > > 'standby' -
> > > > > how
> > > > > do clusters migrate VirtualDomain resources?
> > > > 1. Call resource agent migrate_to action on original node
> > > > 2. Call resource agent migrate_from action on new node
> > > > 3. Call resource agent stop action on original node
> > > > 
> > > > > Do users have any control over it and if so then how?
> > > > The allow-migrate resource meta-attribute (true/false)
> > > > 
> > > > > I'd imagine there must be some docs - I failed to find
> > > > It's sort of scattered throughout Pacemaker Explained -- the
> > > > main
> > > > one
> > > > is:
> > > > 
> > > > https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/html/advanced-options.html#migrating-resources
> > > > 
> > > > > Especially in large deployments one obvious question would be
> > > > > -
> > > > > I'm
> > > > > guessing as my setup is rather SOHO - can VMs migrate in
> > > > > sequence
> > > > > or
> > > > > it is(always?) a kind of 'swarm' migration?
> > > > The migration-limit cluster property specifies how many live
> > > > migrations
> > > > may be initiated at once (the default of -1 means unlimited).
> > > But if this is cluster property - unless I got it wrong,
> > > hopefully - then this govern any/all resources.
> > > If so, can such a limit be rounded down to RA type or
> > > perhaps group of resources?
> > > 
> > > many thanks, L.
> > No, it's global
> To me it feels so intuitive, so natural & obvious that I 
> will ask - nobody yet suggested that such feature be 
> available to smaller divisions of cluster independently of 
> global rule?
> In the vastness of resource types many are polar opposites 
> and to treat them all the same?
> Would be great to have some way to tell cluster to run 
> different migration/relocation limits on for eg. 
> compute-heavy resources VS light-weight ones - where to 
> "file" such a enhancement suggestion, Bugzilla?
> 
> many thanks, L.

Looking at the code, I see it's a little different than I originally
thought.

First, I overlooked that it's correctly documented as a per-node limit
rather than a cluster-wide limit.

That highlights the complexity of allowing different values for
different resources; if rscA has a migration limit of 2, and rscB has a
migration limit of 5, do we allow up to 2 rscA migrations and 5 rscB
migrations simultaneously, or do we weight them relative to each other
so the total capacity is still constrained (for example limiting it to
1 rscA migration and 2 rscB migrations together)?

We would almost need something like the node utilization feature, being
able to define a node's total migration capacity and then how much of
that capacity is taken up by the migration of a specific resource. That
seems overcomplicated to me, especially since there aren't that many
resource types that support live migration.
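
To make the weighting idea above concrete, here is a small Python sketch. It is not an existing Pacemaker feature: `admit_migrations`, the per-type `cost` table and the node `capacity` are all hypothetical. The numbers (capacity 10, rscA costing 5, rscB costing 2) are invented so that 2 rscA or 5 rscB migrations fit on their own, while 1 rscA and 2 rscB fit together, as in the example in the text.

```python
from typing import Dict, List, Tuple


def admit_migrations(pending: List[Tuple[str, str]],
                     cost: Dict[str, int],
                     capacity: int) -> List[str]:
    """Greedily admit pending (name, type) migrations within node capacity."""
    remaining = capacity
    admitted = []
    for name, rsc_type in pending:
        needed = cost[rsc_type]
        if needed <= remaining:
            # Enough migration capacity left on this node: admit it.
            remaining -= needed
            admitted.append(name)
    return admitted


if __name__ == "__main__":
    cost = {"rscA": 5, "rscB": 2}   # invented weights
    pending = [("a1", "rscA"), ("b1", "rscB"), ("b2", "rscB"),
               ("a2", "rscA"), ("b3", "rscB")]
    print(admit_migrations(pending, cost, capacity=10))
```

In this run one rscA and two rscB migrations use 9 of the 10 capacity units, so the remaining candidates have to wait.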

Second, any actions on a Pacemaker Remote node count toward the
throttling limit of its connection host, and aren't checked for
migration-limit at all. That's an interesting design choice, and it's
not clear what the ideal would be. For a VM or container, it kind of
makes sense to count against the host's throttling. For a remote node,
not so much. And I'm guessing not checking migration-limit in this case
is an oversight.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] manner in which cluster migrates VirtualDomain - ?

2023-04-19 Thread lejeczek via Users




On 18/04/2023 21:02, Ken Gaillot wrote:
> On Tue, 2023-04-18 at 19:36 +0200, lejeczek via Users wrote:
> > On 18/04/2023 18:22, Ken Gaillot wrote:
> > > On Tue, 2023-04-18 at 14:58 +0200, lejeczek via Users wrote:
> > > > Hi guys.
> > > >
> > > > When it's done by the cluster itself, eg. a node goes 'standby' -
> > > > how do clusters migrate VirtualDomain resources?
> > > 1. Call resource agent migrate_to action on original node
> > > 2. Call resource agent migrate_from action on new node
> > > 3. Call resource agent stop action on original node
> > >
> > > > Do users have any control over it and if so then how?
> > > The allow-migrate resource meta-attribute (true/false)
> > >
> > > > I'd imagine there must be some docs - I failed to find
> > > It's sort of scattered throughout Pacemaker Explained -- the main
> > > one is:
> > >
> > > https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/html/advanced-options.html#migrating-resources
> > >
> > > > Especially in large deployments one obvious question would be -
> > > > I'm guessing as my setup is rather SOHO - can VMs migrate in
> > > > sequence or it is(always?) a kind of 'swarm' migration?
> > > The migration-limit cluster property specifies how many live
> > > migrations may be initiated at once (the default of -1 means
> > > unlimited).
> > But if this is cluster property - unless I got it wrong,
> > hopefully - then this govern any/all resources.
> > If so, can such a limit be rounded down to RA type or
> > perhaps group of resources?
> >
> > many thanks, L.
>
> No, it's global

To me it feels so intuitive, so natural and obvious, that I
will ask: has nobody yet suggested making such a feature
available to smaller divisions of the cluster, independently
of the global rule?
Among the vast range of resource types many are polar
opposites, so why treat them all the same?
It would be great to have some way to tell the cluster to apply
different migration/relocation limits to, e.g., compute-heavy
resources versus lightweight ones. Where should such an
enhancement suggestion be filed, Bugzilla?


many thanks, L.



Re: [ClusterLabs] manner in which cluster migrates VirtualDomain - ?

2023-04-18 Thread Ken Gaillot
On Tue, 2023-04-18 at 19:36 +0200, lejeczek via Users wrote:
> 
> On 18/04/2023 18:22, Ken Gaillot wrote:
> > On Tue, 2023-04-18 at 14:58 +0200, lejeczek via Users wrote:
> > > Hi guys.
> > > 
> > > When it's done by the cluster itself, eg. a node goes 'standby' -
> > > how
> > > do clusters migrate VirtualDomain resources?
> > 1. Call resource agent migrate_to action on original node
> > 2. Call resource agent migrate_from action on new node
> > 3. Call resource agent stop action on original node
> > 
> > > Do users have any control over it and if so then how?
> > The allow-migrate resource meta-attribute (true/false)
> > 
> > > I'd imagine there must be some docs - I failed to find
> > It's sort of scattered throughout Pacemaker Explained -- the main
> > one
> > is:
> > 
> > https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/html/advanced-options.html#migrating-resources
> > 
> > > Especially in large deployments one obvious question would be -
> > > I'm
> > > guessing as my setup is rather SOHO - can VMs migrate in sequence
> > > or
> > > it is(always?) a kind of 'swarm' migration?
> > The migration-limit cluster property specifies how many live
> > migrations
> > may be initiated at once (the default of -1 means unlimited).
> But if this is cluster property - unless I got it wrong, 
> hopefully - then this govern any/all resources.
> If so, can such a limit be rounded down to RA type or 
> perhaps group of resources?
> 
> many thanks, L.

No, it's global
-- 
Ken Gaillot 



Re: [ClusterLabs] manner in which cluster migrates VirtualDomain - ?

2023-04-18 Thread lejeczek via Users




On 18/04/2023 18:22, Ken Gaillot wrote:
> On Tue, 2023-04-18 at 14:58 +0200, lejeczek via Users wrote:
> > Hi guys.
> >
> > When it's done by the cluster itself, eg. a node goes 'standby' - how
> > do clusters migrate VirtualDomain resources?
>
> 1. Call resource agent migrate_to action on original node
> 2. Call resource agent migrate_from action on new node
> 3. Call resource agent stop action on original node
>
> > Do users have any control over it and if so then how?
>
> The allow-migrate resource meta-attribute (true/false)
>
> > I'd imagine there must be some docs - I failed to find
>
> It's sort of scattered throughout Pacemaker Explained -- the main one
> is:
>
> https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/html/advanced-options.html#migrating-resources
>
> > Especially in large deployments one obvious question would be - I'm
> > guessing as my setup is rather SOHO - can VMs migrate in sequence or
> > it is(always?) a kind of 'swarm' migration?
>
> The migration-limit cluster property specifies how many live migrations
> may be initiated at once (the default of -1 means unlimited).

But if this is a cluster property then - unless I got it wrong,
hopefully - it governs any/all resources.
If so, can such a limit be narrowed down to an RA type or
perhaps a group of resources?


many thanks, L.


Re: [ClusterLabs] manner in which cluster migrates VirtualDomain - ?

2023-04-18 Thread Ken Gaillot
On Tue, 2023-04-18 at 14:58 +0200, lejeczek via Users wrote:
> Hi guys.
> 
> When it's done by the cluster itself, eg. a node goes 'standby' - how
> do clusters migrate VirtualDomain resources?

1. Call resource agent migrate_to action on original node
2. Call resource agent migrate_from action on new node
3. Call resource agent stop action on original node
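
The ordering above can be sketched in Python as follows. This is only an illustration, not Pacemaker code: `live_migrate` and the `run_action` callback are hypothetical names, and stopping at the first failed action is an assumption of the sketch (the thread does not describe how failures are recovered).

```python
from typing import Callable, List, Tuple

# Hypothetical callback: run_action(resource, action, node) -> True on success.
RunAction = Callable[[str, str, str], bool]


def live_migrate(resource: str, source: str, target: str,
                 run_action: RunAction) -> List[Tuple[str, str]]:
    """Run the three agent actions in order and return those that succeeded."""
    steps = [
        ("migrate_to", source),    # 1. on the original node
        ("migrate_from", target),  # 2. on the new node
        ("stop", source),          # 3. on the original node
    ]
    completed = []
    for action, node in steps:
        if not run_action(resource, action, node):
            break  # assumption of this sketch: give up at the first failure
        completed.append((action, node))
    return completed


if __name__ == "__main__":
    always_ok = lambda resource, action, node: True
    for action, node in live_migrate("vm1", "node1", "node2", always_ok):
        print(f"{action} on {node}")
```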

> Do users have any control over it and if so then how?

The allow-migrate resource meta-attribute (true/false)

> I'd imagine there must be some docs - I failed to find

It's sort of scattered throughout Pacemaker Explained -- the main one
is:

https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/html/advanced-options.html#migrating-resources

> Especially in large deployments one obvious question would be - I'm
> guessing as my setup is rather SOHO - can VMs migrate in sequence or
> it is(always?) a kind of 'swarm' migration?

The migration-limit cluster property specifies how many live migrations
may be initiated at once (the default of -1 means unlimited).
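
As a rough illustration of what such a limit does to a mass migration, the Python sketch below splits a list of pending VM migrations into waves. `migration_waves` is a hypothetical helper, not part of Pacemaker. The lockstep waves are a simplification (the real cluster may well start the next migration as soon as a slot frees up), and elsewhere in this thread the limit is described as applying per node. Rejecting a limit of 0 is a choice of the sketch, since the thread only states that -1 means unlimited.

```python
from typing import List


def migration_waves(vms: List[str], limit: int = -1) -> List[List[str]]:
    """Split pending live migrations into waves of at most `limit` each."""
    if limit < 0:
        # Mirrors the stated default: -1 means unlimited, so one big wave.
        return [list(vms)] if vms else []
    if limit == 0:
        raise ValueError("this sketch does not model a limit of 0")
    return [vms[i:i + limit] for i in range(0, len(vms), limit)]


if __name__ == "__main__":
    vms = ["vm1", "vm2", "vm3", "vm4", "vm5"]
    print(migration_waves(vms))           # unlimited: 'swarm' migration
    print(migration_waves(vms, limit=2))  # at most two at a time
    print(migration_waves(vms, limit=1))  # strictly in sequence
```

With the default the five VMs all start at once; a limit of 1 gives the strictly sequential behaviour asked about in the question.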
-- 
Ken Gaillot 



[ClusterLabs] manner in which cluster migrates VirtualDomain - ?

2023-04-18 Thread lejeczek via Users

Hi guys.

When it's done by the cluster itself, e.g. when a node goes
'standby', how do clusters migrate VirtualDomain resources?

Do users have any control over it and, if so, how?
I'd imagine there must be some docs, but I failed to find any.
Especially in large deployments one obvious question would
be (I'm guessing, as my setup is rather SOHO): can VMs
migrate in sequence, or is it (always?) a kind of 'swarm'
migration?


many thanks, L.