[ClusterLabs] Antw: [EXT] Cluster breaks after pcs unstandby node
Hi!

I'm using SLES, but I think your configuration misses many colocations (IMHO every ordering should have a corresponding colocation).

From the logs of node1, this looks odd to me:
attrd[11024]: error: Connection to the CPG API failed: Library error (2)

After "systemd[1]: Unit pacemaker.service entered failed state." it's expected that the node be fenced. However this is not fencing IMHO:
Jan 04 13:59:04 kvm03-node01 systemd-logind[5456]: Power key pressed.
Jan 04 13:59:04 kvm03-node01 systemd-logind[5456]: Powering Off...

The main question is what makes the cluster think the node is lost:
Jan 04 13:58:27 kvm03-node01 corosync[10995]: [TOTEM ] A processor failed, forming new configuration.
Jan 04 13:58:27 kvm03-node02 corosync[28814]: [TOTEM ] A processor failed, forming new configuration.

The answer seems to be node3:
Jan 04 13:58:07 kvm03-node03 crmd[37819]: notice: Initiating monitor operation ipmi-fencing-node02_monitor_6 on kvm03-node02.avigol-gcs.dk
Jan 04 13:58:07 kvm03-node03 crmd[37819]: notice: Initiating monitor operation ipmi-fencing-node03_monitor_6 on kvm03-node01.avigol-gcs.dk
Jan 04 13:58:25 kvm03-node03 corosync[37794]: [TOTEM ] A new membership (172.31.0.31:1044) was formed. Members
Jan 04 13:58:25 kvm03-node03 corosync[37794]: [CPG ] downlist left_list: 0 received
Jan 04 13:58:25 kvm03-node03 corosync[37794]: [CPG ] downlist left_list: 0 received
Jan 04 13:58:25 kvm03-node03 corosync[37794]: [CPG ] downlist left_list: 0 received
Jan 04 13:58:27 kvm03-node03 corosync[37794]: [TOTEM ] A processor failed, forming new configuration.

Before:
Jan 04 13:54:18 kvm03-node03 crmd[37819]: notice: Node kvm03-node02.avigol-gcs.dk state is now lost
Jan 04 13:54:18 kvm03-node03 crmd[37819]: notice: Node kvm03-node02.avigol-gcs.dk state is now lost

No idea why, but then:
Jan 04 13:54:18 kvm03-node03 crmd[37819]: notice: Node kvm03-node02.avigol-gcs.dk state is now lost

Why "shutdown" and not "fencing"?

(A side-note on "pe-input-497.bz2": you may want to limit the number of policy files being kept; here I use 100 as the limit.)

Node2 then seems to have rejoined before being fenced:
Jan 04 13:57:21 kvm03-node03 crmd[37819]: notice: State transition S_IDLE -> S_POLICY_ENGINE

Then node3 seems unavailable, moving resources to node2:
Jan 04 13:58:07 kvm03-node03 crmd[37819]: notice: State transition S_IDLE -> S_POLICY_ENGINE
Jan 04 13:58:07 kvm03-node03 pengine[37818]: notice: * Move ipmi-fencing-node02 ( kvm03-node03.avigol-gcs.dk -> kvm03-node02.avigol-gcs.dk )
Jan 04 13:58:07 kvm03-node03 pengine[37818]: notice: * Move ipmi-fencing-node03 ( kvm03-node03.avigol-gcs.dk -> kvm03-node01.avigol-gcs.dk )
Jan 04 13:58:07 kvm03-node03 pengine[37818]: notice: * Stop dlm:2 ( kvm03-node03.avigol-gcs.dk ) due to node availability

Then node1 seems gone:
Jan 04 13:58:27 kvm03-node03 corosync[37794]: [TOTEM ] A processor failed, forming new configuration.
Then suddenly node1 is here again:
Jan 04 13:58:33 kvm03-node03 crmd[37819]: notice: Stonith/shutdown of kvm03-node01.avigol-gcs.dk not matched
Jan 04 13:58:33 kvm03-node03 crmd[37819]: notice: Transition aborted: Node failure
Jan 04 13:58:33 kvm03-node03 cib[37814]: notice: Node kvm03-node01.avigol-gcs.dk state is now member
Jan 04 13:58:33 kvm03-node03 attrd[37817]: notice: Node kvm03-node01.avigol-gcs.dk state is now member
Jan 04 13:58:33 kvm03-node03 dlm_controld[39252]: 5452 cpg_mcast_joined retry 300 plock
Jan 04 13:58:33 kvm03-node03 stonith-ng[37815]: notice: Node kvm03-node01.avigol-gcs.dk state is now member

And it's lost again:
Jan 04 13:58:33 kvm03-node03 attrd[37817]: notice: Node kvm03-node01.avigol-gcs.dk state is now lost
Jan 04 13:58:33 kvm03-node03 cib[37814]: notice: Node kvm03-node01.avigol-gcs.dk state is now lost
Jan 04 13:58:33 kvm03-node03 crmd[37819]: warning: No reason to expect node 1 to be down
Jan 04 13:58:33 kvm03-node03 crmd[37819]: notice: Stonith/shutdown of kvm03-node01.avigol-gcs.dk not matched

Then it seems only node1 can fence node1, but communication with node1 is lost:
Jan 04 13:59:03 kvm03-node03 stonith-ng[37815]: notice: ipmi-fencing-node02 can not fence (reboot) kvm03-node01.avigol-gcs.dk: static-list
Jan 04 13:59:03 kvm03-node03 stonith-ng[37815]: notice: ipmi-fencing-node03 can not fence (reboot) kvm03-node01.avigol-gcs.dk: static-list
Jan 04 13:59:03 kvm03-node03 stonith-ng[37815]: notice: ipmi-fencing-node01 can fence (reboot) kvm03-node01.avigol-gcs.dk: static-list
Jan 04 13:59:03 kvm03-node03 stonith-ng[37815]: notice: ipmi-fencing-node02 can not fence (reboot) kvm03-node01.avigol-gcs.dk: static-list
Jan 04 13:59:03 kvm03-node03 stonith-ng[37815]: notice: ipmi-fencing-node03 can not fence (reboot) kvm03-node01.avigol-gcs.dk: static-list
Jan 04 13:59:03 kvm03-node03 stonith-ng[37815]: notice: ipmi-fencing-node01 can fence (reboot) kvm03-node01.avigol-gcs.dk: sta
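To make the colocation remark at the top of this message and the pe-input housekeeping side-note concrete, a minimal pcs sketch; the resource names (dlm-clone, lvmlockd-clone) are hypothetical stand-ins, so adapt them to the orderings actually present in the configuration:

"""
# Pair an existing ordering with a matching colocation (names are hypothetical;
# use the resources from your own 'pcs constraint order' output):
pcs constraint order start dlm-clone then start lvmlockd-clone
pcs constraint colocation add lvmlockd-clone with dlm-clone INFINITY

# Limit how many pe-input-*.bz2 policy files Pacemaker keeps per node
# (100 is the limit mentioned above):
pcs property set pe-input-series-max=100
"""

The point of the pairing: an ordering only says "start B after A", while the colocation keeps B on the same node as A; without it the cluster may start B on a node where A never ran.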
Re: [ClusterLabs] How to set up "active-active" cluster by balancing multiple exports across servers?
Hi.

I would run nfsserver and nfsnotify as a separate cloned group and make both other groups colocated/ordered with it. So the NFS server will be just a per-host service, and then you attach exports (with LVs, filesystems, IP addresses) to it. The NFS server in Linux is an in-kernel creature, not a userspace process, and it is not designed to have several instances bound to different addresses. But with the approach above you can overcome that.

On Tue, 2021-01-12 at 11:04 -0700, Billy Wilson wrote:
> I'm having trouble setting up what seems like it should be a
> straightforward NFS-HA design. It is similar to what Christoforos
> Christoforou attempted to do earlier in 2020
> (https://www.mail-archive.com/users@clusterlabs.org/msg09671.html).
>
> My goal is to balance multiple NFS exports across two nodes to
> effectively have an "active-active" configuration. Each export should
> only be available from one node at a time, but they should be able to
> freely fail back and forth to balance between the two nodes.
>
> I'm also hoping to isolate each exported filesystem to its own set of
> underlying disks, to prevent heavy IO on one exported filesystem from
> affecting another one. So each filesystem to be exported should be
> backed by a unique volume group.
>
> I've set up two nodes with fencing, an ethmonitor clone, and the
> following two resource groups.
>
> """
> * Resource Group: ha1:
>   * alice_lvm        (ocf::heartbeat:LVM-activate):  Started host1
>   * alice_xfs        (ocf::heartbeat:Filesystem):    Started host1
>   * alice_nfs        (ocf::heartbeat:nfsserver):     Started host1
>   * alice_ip         (ocf::heartbeat:IPaddr2):       Started host1
>   * alice_nfsnotify  (ocf::heartbeat:nfsnotify):     Started host1
>   * alice_login01    (ocf::heartbeat:exportfs):      Started host1
>   * alice_login02    (ocf::heartbeat:exportfs):      Started host1
> * Resource Group: ha2:
>   * bob_lvm          (ocf::heartbeat:LVM-activate):  Started host2
>   * bob_xfs          (ocf::heartbeat:Filesystem):    Started host2
>   * bob_nfs          (ocf::heartbeat:nfsserver):     Started host2
>   * bob_ip           (ocf::heartbeat:IPaddr2):       Started host2
>   * bob_nfsnotify    (ocf::heartbeat:nfsnotify):     Started host2
>   * bob_login01      (ocf::heartbeat:exportfs):      Started host2
>   * bob_login02      (ocf::heartbeat:exportfs):      Started host2
> """
>
> We had an older storage appliance that used Red Hat HA on RHEL 6 (back
> when it still used RGManager and not Pacemaker), and it was capable of
> load-balanced NFS-HA like this.
>
> The problem with this approach using Pacemaker is that the "nfsserver"
> resource agent only wants one instance per host. During a failover
> event, both "nfsserver" RAs will try to bind mount the NFS shared info
> directory to /var/lib/nfs/. Only one will claim the directory.
>
> If I convert everything to a single resource group as Christoforos did,
> then the cluster is active-passive, and all the resources fail as a
> single unit. Having one node serve all the exports while the other is
> idle doesn't seem very ideal.
>
> I'd like to eventually have something like this:
>
> """
> * Resource Group: ha1:
>   * alice_lvm          (ocf::heartbeat:LVM-activate):  Started host1
>   * alice_xfs          (ocf::heartbeat:Filesystem):    Started host1
>   * charlie_lvm        (ocf::heartbeat:LVM-activate):  Started host1
>   * charlie_xfs        (ocf::heartbeat:Filesystem):    Started host1
>   * ha1_nfs            (ocf::heartbeat:nfsserver):     Started host1
>   * alice_ip           (ocf::heartbeat:IPaddr2):       Started host1
>   * charlie_ip         (ocf::heartbeat:IPaddr2):       Started host1
>   * ha1_nfsnotify      (ocf::heartbeat:nfsnotify):     Started host1
>   * alice_login01      (ocf::heartbeat:exportfs):      Started host1
>   * alice_login02      (ocf::heartbeat:exportfs):      Started host1
>   * charlie_login01    (ocf::heartbeat:exportfs):      Started host1
>   * charlie_login02    (ocf::heartbeat:exportfs):      Started host1
> * Resource Group: ha2:
>   * bob_lvm            (ocf::heartbeat:LVM-activate):  Started host2
>   * bob_xfs            (ocf::heartbeat:Filesystem):    Started host2
>   * david_lvm          (ocf::heartbeat:LVM-activate):  Started host2
>   * david_xfs          (ocf::heartbeat:Filesystem):    Started host2
>   * ha2_nfs            (ocf::heartbeat:nfsserver):     Started host2
>   * bob_ip             (ocf::heartbeat:IPaddr2):       Started host2
>   * david_ip           (ocf::heartbeat:IPaddr2):       Started host2
>   * ha2_nfsnotify      (ocf::heartbeat:nfsnotify):     Started host2
>   * bob_login01        (ocf::heartbeat:exportfs):      Started host2
>   * bob_login02        (ocf::heartbeat:exportfs):      Started host2
>   * david_login01      (ocf::heartbeat:exportfs):      Started host2
>   * david_login02      (ocf::heartbeat:exportfs):      Started host2
> """
>
> Or even this:
>
> """
> * Resource Group: alice_research:
>   * alice_lvm (ocf::he
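As a rough pcs sketch of the cloned-nfsserver suggestion at the top of this reply — untested, the resource and clone names (nfsd, nfsd-notify, nfs-base) are invented, and the per-group nfsserver/nfsnotify resources would first be removed from ha1/ha2:

"""
# One NFS server (plus notify) per host, cloned across the cluster:
pcs resource create nfsd ocf:heartbeat:nfsserver
pcs resource create nfsd-notify ocf:heartbeat:nfsnotify
pcs resource group add nfs-base nfsd nfsd-notify
pcs resource clone nfs-base

# Each export group then runs only where an NFS server instance runs,
# and starts only after it:
pcs constraint colocation add ha1 with nfs-base-clone INFINITY
pcs constraint order start nfs-base-clone then start ha1
pcs constraint colocation add ha2 with nfs-base-clone INFINITY
pcs constraint order start nfs-base-clone then start ha2
"""

With this layout the ha1/ha2 groups keep only the LVs, filesystems, IP addresses and exportfs resources, so they can be balanced across the two nodes independently of the single per-host NFS server.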
[ClusterLabs] How to set up "active-active" cluster by balancing multiple exports across servers?
I'm having trouble setting up what seems like it should be a straightforward NFS-HA design. It is similar to what Christoforos Christoforou attempted to do earlier in 2020 (https://www.mail-archive.com/users@clusterlabs.org/msg09671.html).

My goal is to balance multiple NFS exports across two nodes to effectively have an "active-active" configuration. Each export should only be available from one node at a time, but they should be able to freely fail back and forth to balance between the two nodes.

I'm also hoping to isolate each exported filesystem to its own set of underlying disks, to prevent heavy IO on one exported filesystem from affecting another one. So each filesystem to be exported should be backed by a unique volume group.

I've set up two nodes with fencing, an ethmonitor clone, and the following two resource groups.

"""
* Resource Group: ha1:
  * alice_lvm        (ocf::heartbeat:LVM-activate):  Started host1
  * alice_xfs        (ocf::heartbeat:Filesystem):    Started host1
  * alice_nfs        (ocf::heartbeat:nfsserver):     Started host1
  * alice_ip         (ocf::heartbeat:IPaddr2):       Started host1
  * alice_nfsnotify  (ocf::heartbeat:nfsnotify):     Started host1
  * alice_login01    (ocf::heartbeat:exportfs):      Started host1
  * alice_login02    (ocf::heartbeat:exportfs):      Started host1
* Resource Group: ha2:
  * bob_lvm          (ocf::heartbeat:LVM-activate):  Started host2
  * bob_xfs          (ocf::heartbeat:Filesystem):    Started host2
  * bob_nfs          (ocf::heartbeat:nfsserver):     Started host2
  * bob_ip           (ocf::heartbeat:IPaddr2):       Started host2
  * bob_nfsnotify    (ocf::heartbeat:nfsnotify):     Started host2
  * bob_login01      (ocf::heartbeat:exportfs):      Started host2
  * bob_login02      (ocf::heartbeat:exportfs):      Started host2
"""

We had an older storage appliance that used Red Hat HA on RHEL 6 (back when it still used RGManager and not Pacemaker), and it was capable of load-balanced NFS-HA like this.

The problem with this approach using Pacemaker is that the "nfsserver" resource agent only wants one instance per host. During a failover event, both "nfsserver" RAs will try to bind mount the NFS shared info directory to /var/lib/nfs/. Only one will claim the directory.

If I convert everything to a single resource group as Christoforos did, then the cluster is active-passive, and all the resources fail as a single unit. Having one node serve all the exports while the other is idle doesn't seem very ideal.
I'd like to eventually have something like this:

"""
* Resource Group: ha1:
  * alice_lvm          (ocf::heartbeat:LVM-activate):  Started host1
  * alice_xfs          (ocf::heartbeat:Filesystem):    Started host1
  * charlie_lvm        (ocf::heartbeat:LVM-activate):  Started host1
  * charlie_xfs        (ocf::heartbeat:Filesystem):    Started host1
  * ha1_nfs            (ocf::heartbeat:nfsserver):     Started host1
  * alice_ip           (ocf::heartbeat:IPaddr2):       Started host1
  * charlie_ip         (ocf::heartbeat:IPaddr2):       Started host1
  * ha1_nfsnotify      (ocf::heartbeat:nfsnotify):     Started host1
  * alice_login01      (ocf::heartbeat:exportfs):      Started host1
  * alice_login02      (ocf::heartbeat:exportfs):      Started host1
  * charlie_login01    (ocf::heartbeat:exportfs):      Started host1
  * charlie_login02    (ocf::heartbeat:exportfs):      Started host1
* Resource Group: ha2:
  * bob_lvm            (ocf::heartbeat:LVM-activate):  Started host2
  * bob_xfs            (ocf::heartbeat:Filesystem):    Started host2
  * david_lvm          (ocf::heartbeat:LVM-activate):  Started host2
  * david_xfs          (ocf::heartbeat:Filesystem):    Started host2
  * ha2_nfs            (ocf::heartbeat:nfsserver):     Started host2
  * bob_ip             (ocf::heartbeat:IPaddr2):       Started host2
  * david_ip           (ocf::heartbeat:IPaddr2):       Started host2
  * ha2_nfsnotify      (ocf::heartbeat:nfsnotify):     Started host2
  * bob_login01        (ocf::heartbeat:exportfs):      Started host2
  * bob_login02        (ocf::heartbeat:exportfs):      Started host2
  * david_login01      (ocf::heartbeat:exportfs):      Started host2
  * david_login02      (ocf::heartbeat:exportfs):      Started host2
"""

Or even this:

"""
* Resource Group: alice_research:
  * alice_lvm          (ocf::heartbeat:LVM-activate):  Started host1
  * alice_xfs          (ocf::heartbeat:Filesystem):    Started host1
  * alice_nfs          (ocf::heartbeat:nfsserver):     Started host1
  * alice_ip           (ocf::heartbeat:IPaddr2):       Started host1
  * alice_nfsnotify    (ocf::heartbeat:nfsnotify):     Started host1
  * alice_login01      (ocf::heartbeat:exportfs):      Started host1
  * alice_login02      (ocf::heartbeat:exportfs):      Started host1
* Resource Group: charlie_research:
  * charlie_lvm        (ocf::heartbeat:LVM-activate):  Started host1
  * charlie_xfs        (ocf::heartbeat:Filesystem):    Started host1
  * charlie_nfs        (ocf::heartbeat:nfsserver):     Started host1
  * charlie_ip         (ocf::heartbeat:IPaddr2)
Re: [ClusterLabs] Q: List resources affected by utilization limits
On 1/13/21 9:14 AM, Ulrich Windl wrote:
> Hi!
>
> I had made a test: I had configured RAM requirements for some test VMs
> together with node RAM capacities. Things were running fine. Then as a
> test I reduced the RAM capacity of all nodes, and test VMs were stopped
> due to not enough RAM.
>
> Now I wonder: is there a command that can list those resources that
> couldn't start because of "not enough node capacity"? Preferably
> combined with the utilization attribute that could not be fulfilled?

crm_simulate -LU should give some hints.

Regards,
Yan

> Regards,
> Ulrich
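Spelled out with the long option names, a minimal sketch of that check (option spellings may vary slightly between Pacemaker versions):

"""
# Show current cluster state plus node and resource utilization, from the live CIB:
crm_simulate --live-check --show-utilization        # short form: crm_simulate -LU

# Optionally add allocation scores to see why a particular resource was not placed:
crm_simulate --live-check --show-utilization --show-scores
"""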
Re: [ClusterLabs] Antw: [EXT] Re: Questions about the infamous TOTEM retransmit list
On 1/13/21 3:31 PM, Ulrich Windl wrote:
> Roger Zhou wrote on 13.01.2021 at 05:32 in message
> <97ac2305-85b4-cbb0-7133-ac1372143...@suse.com>:
>> On 1/12/21 4:23 PM, Ulrich Windl wrote:
>>> Hi!
>>>
>>> Before setting up our first pacemaker cluster we thought one low-speed
>>> redundant network would be good in addition to the normal high-speed
>>> network. However, as it seems now (SLES15 SP2), there is NO reasonable
>>> RRP mode to drive such a configuration with corosync. Passive RRP mode
>>> with UDPU still sends each packet through both nets,
>>
>> Indeed, packets are sent in the round-robin fashion.
>>
>>> being throttled by the slower network. (Originally we were using
>>> multicast, but that was even worse.)
>>>
>>> Now I realized that even under modest load, I see messages about
>>> "retransmit list", like this:
>>> Jan 08 10:57:56 h16 corosync[3562]: [TOTEM ] Retransmit List: 3e2
>>> Jan 08 10:57:56 h16 corosync[3562]: [TOTEM ] Retransmit List: 3e2 3e4
>>> Jan 08 11:13:21 h16 corosync[3562]: [TOTEM ] Retransmit List: 60e 610 612 614
>>> Jan 08 11:13:21 h16 corosync[3562]: [TOTEM ] Retransmit List: 610 614
>>> Jan 08 11:13:21 h16 corosync[3562]: [TOTEM ] Retransmit List: 614
>>> Jan 08 11:13:41 h16 corosync[3562]: [TOTEM ] Retransmit List: 6ed
>>
>> What's the latency of this low speed link?
>
> The normal net is fibre-based:
> 4 packets transmitted, 4 received, 0% packet loss, time 3058ms
> rtt min/avg/max/mdev = 0.131/0.175/0.205/0.027 ms
>
> The redundant net is copper-based:
> 5 packets transmitted, 5 received, 0% packet loss, time 4104ms
> rtt min/avg/max/mdev = 0.293/0.304/0.325/0.019 ms

Aha, RTT < 1 ms, the network is fast enough. That clears my doubt; I had guessed the latency of the slow link might be in the tens or even hundreds of ms. Then I might wonder whether the corosync packets simply got unlucky and were delayed by workload on one of the links.

>>> Questions on that:
>>> Will the situation be much better with knet?
>>
>> knet provides "link_mode: passive", which is not round-robin and could
>> fit your idea somewhat. But it still doesn't fit your game well, since
>> knet again assumes similar latency among links. You may have to tune
>> parameters for the low speed link and likely sacrifice the benefit from
>> the fast link.
>
> Well, in the past when using HP Service Guard, everything worked quite
> differently: There was a true heartbeat on each cluster net, determining
> its "being alive", and when the cluster performed no action there was no
> traffic on the cluster links (except that heartbeat). When the cluster
> actually had to talk, it used the link that was flagged "alive", with a
> preference of primary first, then secondary when both were available.

"link_mode: passive" together with knet_link_priority would be useful. Also, using sctp in knet could be an alternative.

Cheers,
Roger
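To make that concrete, a hedged corosync.conf fragment (corosync 3 / knet syntax; the cluster name, addresses, and priority values are invented, only one node is shown, and this is untested here):

"""
totem {
    version: 2
    cluster_name: hacluster
    transport: knet
    # Use only the highest-priority available link instead of round-robin:
    link_mode: passive
    interface {
        linknumber: 0            # fast fibre net, preferred
        knet_link_priority: 2
    }
    interface {
        linknumber: 1            # slow copper net, used only if link 0 fails
        knet_link_priority: 1
        # knet_transport: sctp   # optional per-link transport, as mentioned above
    }
}

nodelist {
    node {
        name: h16
        nodeid: 1
        ring0_addr: 192.168.1.16   # example addresses, one per link
        ring1_addr: 10.0.0.16
    }
}
"""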
Re: [ClusterLabs] Antw: Re: Antw: Re: Antw: [EXT] Re: A bug? (SLES15 SP2 with "crm resource refresh")
On 1/12/21 8:23 AM, Ulrich Windl wrote:
> Ken Gaillot wrote on 11.01.2021 at 16:45 in message
> <3e78312a1c92cde0a1cdd82c2fed33a679f63770.ca...@redhat.com>:
>
> ...
>
>> from growing indefinitely). (Plus some timing issues to consider.)
>>> Wouldn't a temporary local status variable do also?
>
> Hi Ken,
>
> I appreciate your comments.
>
>> No, the scheduler is stateless. All information that the scheduler
>> needs must be contained within the CIB.
>>
>> The main advantages of that approach are (1) the scheduler can crash
>> and respawn without causing any problems; (2) the DC can be changed to
>
> I think it's nice to be able to recover smoothly after a crash, but
> program design should not be biased towards frequent crashes ;-)
>
>> another node at any time without causing any problems; and (3) saved
>
> Well, if every status update is stored in the CIB (as it seems to be),
> changing DCs shouldn't be a big problem until there are multiple at the
> same time.
>
>> CIBs can be replayed for debugging and testing purposes with the
>> identical result as a live cluster.
>
> Are you talking about the whole CIB, or about the configuration section
> of the CIB? I can't see any sense in replaying the status section of the
> CIB unless you want to debug resource recovery and probing.

That is the whole CIB. All the scheduler regression tests work like that: feed the CIB into crm_simulate and see what it does.

Klaus

> ...
>
> Regards,
> Ulrich
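For illustration, a minimal sketch of that replay workflow (file names are invented; the pengine directory path is the usual default and may differ on your distribution):

"""
# Grab the full CIB (configuration + status) from a live cluster:
cibadmin --query > /tmp/cib-snapshot.xml

# Replay it through the scheduler, without touching the live cluster,
# and show the resulting transition plus allocation scores:
crm_simulate --xml-file /tmp/cib-snapshot.xml --simulate --show-scores

# The same works with a saved scheduler input file, e.g.:
crm_simulate --xml-file /var/lib/pacemaker/pengine/pe-input-123.bz2 --simulate
"""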
Re: [ClusterLabs] Questions about the infamous TOTEM retransmit list
On 1/12/21 4:23 PM, Ulrich Windl wrote:
> Hi!
>
> Before setting up our first pacemaker cluster we thought one low-speed
> redundant network would be good in addition to the normal high-speed
> network. However, as it seems now (SLES15 SP2), there is NO reasonable
> RRP mode to drive such a configuration with corosync. Passive RRP mode
> with UDPU still sends each packet through both nets,

Indeed, packets are sent in the round-robin fashion.

> being throttled by the slower network. (Originally we were using
> multicast, but that was even worse.)
>
> Now I realized that even under modest load, I see messages about
> "retransmit list", like this:
> Jan 08 10:57:56 h16 corosync[3562]: [TOTEM ] Retransmit List: 3e2
> Jan 08 10:57:56 h16 corosync[3562]: [TOTEM ] Retransmit List: 3e2 3e4
> Jan 08 11:13:21 h16 corosync[3562]: [TOTEM ] Retransmit List: 60e 610 612 614
> Jan 08 11:13:21 h16 corosync[3562]: [TOTEM ] Retransmit List: 610 614
> Jan 08 11:13:21 h16 corosync[3562]: [TOTEM ] Retransmit List: 614
> Jan 08 11:13:41 h16 corosync[3562]: [TOTEM ] Retransmit List: 6ed

What's the latency of this low speed link? I guess it is rather large, and probably not suitable for this use unless the default corosync.conf is tuned carefully. Put another way, by default corosync is meant for a local network with small latency; it is not designed for links with very different latencies.

> Questions on that:
> Will the situation be much better with knet?

knet provides "link_mode: passive", which is not round-robin and could fit your idea somewhat. But it still doesn't fit your game well, since knet again assumes similar latency among links. You may have to tune parameters for the low speed link and likely sacrifice the benefit from the fast link.

> Is there a smooth migration path from UDPU to knet?

Off the top of my head, corosync 3 needs a restart when switching from "transport: udpu" to "transport: knet".

Cheers,
Roger
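A rough outline of what that switch might look like, offered only as a sketch, not a tested procedure; presumably the whole cluster has to be down at once, since nodes speaking udpu and knet cannot form a membership together:

"""
# 1. Stop the cluster stack on every node:
systemctl stop pacemaker corosync

# 2. On every node, change the transport in /etc/corosync/corosync.conf:
#        totem {
#            transport: knet      # was: udpu
#            ...
#        }
#    plus any knet link options (link_mode, knet_link_priority, ...)
#    and per-link ring addresses in the nodelist.

# 3. Start the stack again on every node:
systemctl start corosync pacemaker
"""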
[ClusterLabs] Q: List resources affected by utilization limits
Hi!

I had made a test: I had configured RAM requirements for some test VMs together with node RAM capacities. Things were running fine. Then as a test I reduced the RAM capacity of all nodes, and test VMs were stopped due to not enough RAM.

Now I wonder: is there a command that can list those resources that couldn't start because of "not enough node capacity"? Preferably combined with the utilization attribute that could not be fulfilled?

Regards,
Ulrich
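For context, such a setup might look roughly like this in "crm configure show" output (a sketch only: node names, the resource name, the VirtualDomain agent choice, and all numbers are invented):

"""
# Node RAM capacities and per-VM RAM requirements, plus the placement
# strategy that makes the scheduler honour them:
node ha-node1 \
        utilization memory=65536
node ha-node2 \
        utilization memory=65536
primitive test-vm1 ocf:heartbeat:VirtualDomain \
        params config="/etc/libvirt/qemu/test-vm1.xml" \
        utilization memory=16384
property cib-bootstrap-options: \
        placement-strategy=balanced
"""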