Re: [ClusterLabs] Resources always return to original node
Resource stickiness for a group is the sum of all member resources' stickiness -> 5 resources x 100 score (default stickiness) = 500 score. If your location constraint has a bigger number -> it wins :)

Best Regards,
Strahil Nikolov

On Saturday, 26 September 2020 at 12:22:32 GMT+3, Michael Ivanov wrote:

> Hallo,
>
> I have a strange problem: when I reset the node on which my resources are
> running, they are correctly migrated to the other node. But when I bring the
> failed node back, as soon as it is up all resources are returned to it. I
> have set the resource-stickiness default value to 100. When this did not
> help, I also set the resource-stickiness meta attribute to 100 for all my
> resources. Still, when the failed node recovers, the resources are migrated
> back to it! Where should I look to try to understand this situation?
>
> [quoted pcs status and resource configuration snipped; see the original
> message below]
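To compare the two scores directly, something like the following should work (a sketch assuming a recent pcs/Pacemaker; not verified against this cluster):

    # list all configured constraints with their ids and scores
    pcs constraint --full
    # show the scores the scheduler computes for each resource on each node
    crm_simulate -sL

If a location constraint scores higher than the group's combined stickiness (500 here), either lower/remove that constraint or raise the stickiness, e.g. "pcs resource meta VirtIP resource-stickiness=1000".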
Re: [ClusterLabs] Resources always return to original node
On 26.09.2020 12:22, Michael Ivanov wrote:
> Hallo,
>
> I have a strange problem: when I reset the node on which my resources are
> running, they are correctly migrated to the other node. But when I bring the
> failed node back, as soon as it is up all resources are returned to it. I
> have set the resource-stickiness default value to 100. When this did not
> help, I also set the resource-stickiness meta attribute to 100 for all my
> resources. Still, when the failed node recovers, the resources are migrated
> back to it! Where should I look to try to understand this situation?

The first things to check are location and colocation constraints.

> [quoted pcs status and resource configuration snipped; see the original
> message below]
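One common culprit is a leftover constraint from an earlier "pcs resource move" or "pcs resource ban": pcs implements these as cli-prefer/cli-ban location constraints that persist until explicitly cleared. A sketch, assuming a recent pcs:

    # look for cli-prefer / cli-ban entries among the location constraints
    pcs constraint location show --full
    # remove any move/ban constraints left behind for the group
    pcs resource clear VirtIP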
[ClusterLabs] Resources always return to original node
Hallo,

I have a strange problem: when I reset the node on which my resources are running, they are correctly migrated to the other node. But when I bring the failed node back, as soon as it is up all resources are returned to it. I have set the resource-stickiness default value to 100. When this did not help, I also set the resource-stickiness meta attribute to 100 for all my resources. Still, when the failed node recovers, the resources are migrated back to it! Where should I look to try to understand this situation?

Here's the configuration of my cluster:

root@node1# pcs status
Cluster name: gcluster
Cluster Summary:
  * Stack: corosync
  * Current DC: node1 (version 2.0.4-2deceaa3ae) - partition with quorum
  * Last updated: Sat Sep 26 11:12:34 2020
  * Last change: Sat Sep 26 10:39:16 2020 by root via cibadmin on node1
  * 2 nodes configured
  * 14 resource instances configured (1 DISABLED)

Node List:
  * Online: [ node1 node2 ]

Full List of Resources:
  * ilo5_node1 (stonith:fence_ilo5_ssh): Started node2
  * ilo5_node2 (stonith:fence_ilo5_ssh): Started node1
  * Resource Group: VirtIP:
    * PrimaryIP (ocf::heartbeat:IPaddr2): Started node2
    * PrimaryIP6 (ocf::heartbeat:IPv6addr): Started node2
    * AliasIP (ocf::heartbeat:IPaddr2): Started node2
  * BackupFS (ocf::redhat:netfs.sh): Started node2
  * Clone Set: MailVolume-clone [MailVolume] (promotable):
    * Masters: [ node2 ]
    * Slaves: [ node1 ]
  * MailFS (ocf::heartbeat:Filesystem): Started node2
  * apache (ocf::heartbeat:apache): Started node2
  * postfix (ocf::heartbeat:postfix): Started node2
  * amavis (service:amavis): Started node2
  * dovecot (service:dovecot): Started node2
  * openvpn (service:openvpn): Stopped (disabled)

And resources:

root@node1# pcs resource config
Group: VirtIP
  Meta Attrs: resource-stickiness=100
  Resource: PrimaryIP (class=ocf provider=heartbeat type=IPaddr2)
    Attributes: cidr_netmask=16 ip=xx.xx.xx.20 nic=br0
    Meta Attrs: resource-stickiness=100
    Operations: monitor interval=30s (PrimaryIP-monitor-interval-30s)
                start interval=0s timeout=20s (PrimaryIP-start-interval-0s)
                stop interval=0s timeout=20s (PrimaryIP-stop-interval-0s)
  Resource: PrimaryIP6 (class=ocf provider=heartbeat type=IPv6addr)
    Attributes: cidr_netmask=64 ipv6addr=::::0:0:0:20 nic=br0
    Meta Attrs: resource-stickiness=100
    Operations: monitor interval=30s (PrimaryIP6-monitor-interval-30s)
                start interval=0s timeout=15s (PrimaryIP6-start-interval-0s)
                stop interval=0s timeout=15s (PrimaryIP6-stop-interval-0s)
  Resource: AliasIP (class=ocf provider=heartbeat type=IPaddr2)
    Attributes: cidr_netmask=16 ip=xx.xx.yy.20 nic=br0
    Meta Attrs: resource-stickiness=100
    Operations: monitor interval=30s (AliasIP-monitor-interval-30s)
                start interval=0s timeout=20s (AliasIP-start-interval-0s)
                stop interval=0s timeout=20s (AliasIP-stop-interval-0s)
Resource: BackupFS (class=ocf provider=redhat type=netfs.sh)
  Attributes: export=/Backup/Gateway fstype=nfs host=atlas mountpoint=/Backup options=noatime,async
  Meta Attrs: resource-stickiness=100
  Operations: monitor interval=1m timeout=10 (BackupFS-monitor-interval-1m)
              monitor interval=5m timeout=30 OCF_CHECK_LEVEL=10 (BackupFS-monitor-interval-5m)
              monitor interval=10m timeout=30 OCF_CHECK_LEVEL=20 (BackupFS-monitor-interval-10m)
              start interval=0s timeout=900 (BackupFS-start-interval-0s)
              stop interval=0s timeout=30 (BackupFS-stop-interval-0s)
Clone: MailVolume-clone
  Meta Attrs: clone-max=2 clone-node-max=1 notify=true promotable=true promoted-max=1 promoted-node-max=1 resource-stickiness=100
  Resource: MailVolume (class=ocf provider=linbit type=drbd)
    Attributes: drbd_resource=mail
    Meta Attrs: resource-stickiness=100
    Operations: demote interval=0s timeout=90 (MailVolume-demote-interval-0s)
                monitor interval=60s (MailVolume-monitor-interval-60s)
                notify interval=0s timeout=90 (MailVolume-notify-interval-0s)
                promote interval=0s timeout=90 (MailVolume-promote-interval-0s)
                reload interval=0s timeout=30 (MailVolume-reload-interval-0s)
                start interval=0s timeout=240 (MailVolume-start-interval-0s)
                stop interval=0s timeout=100 (MailVolume-stop-interval-0s)
Resource: MailFS (class=ocf provider=heartbeat type=Filesystem)
  Attributes: device=/dev/drbd0 directory=/var/mail fstype=btrfs
  Meta Attrs: resource-stickiness