Re: [ClusterLabs] HA-Cluster, UPS and power outage - how is your setup ?
On 2022-02-01 11:16, Lentes, Bernd wrote:

Hi, we just experienced two power outages in a few days. This showed me that our UPS configuration and the handling of resources on the cluster are insufficient. We have a two-node cluster with SLES 12 SP5 and a Smart-UPS SRT 3000 from APC with a Network Management Card. The UPS is able to buffer the two nodes and some hardware (SAN, monitor) for about one hour. Our resources are virtual domains, about 20 of different flavours and versions. Our primary goal is not to ride out a power outage for as long as possible, but to shut down all domains cleanly after a set time. I'm currently thinking of waiting for a set time (maybe 15 minutes) and then doing a "crm resource stop VirtualDomains" in a script. I would give the cluster some time for the shutdown (5-10 minutes) and afterwards shut down the nodes (via script). I have to keep an eye on whether both nodes are running or only one of them. What is your approach?

Bernd

I don't know if this will be a useful answer for you, but I haven't seen anyone else reply.

In the Anvil!, we use SNMP to collect data on the APC UPSes powering a given cluster. The OIDs we read are at the head of this file, and the logic to read and collect the data starts here:

https://github.com/ClusterLabs/anvil/blob/main/scancore-agents/scan-apc-ups/scan-apc-ups#L3026

Some processing happens in-agent, but mainly the collected data is written to a generic "power" table (as we support any UPS we can collect data from). When we're done scanning, we analyze the data in the 'power' table to decide if we need to shed load (withdraw and power off nodes to extend runtime), do a complete graceful shutdown (if the batteries are about to die), or reboot the nodes after power is restored. This logic is handled mainly here: first, we figure out which UPS powers which nodes/clusters, then we pull the data on those specific UPSes to return a general "power state".
https://github.com/ClusterLabs/anvil/blob/main/Anvil/Tools/ScanCore.pm#L607

The power state then tells the main daemon what actions to take, if any (load shed, shut down, restart). That's here:

https://github.com/ClusterLabs/anvil/blob/main/Anvil/Tools/ScanCore.pm#L1541

This is super high level, and much of the specifics are related to the Anvil! cluster, but it hopefully gives you a starting point on how to approach the problem. We've been doing it this way for many years to really good effect.

Cheers

-- Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould

___
Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users
ClusterLabs home: https://www.clusterlabs.org/
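Bernd's plan above (stay up for a fixed time on battery, stop the VirtualDomain resources, then power off the nodes) can be sketched as a small policy script. This is only a sketch: the thresholds are the ones he mentions, but the `decide_action` helper and how the time-on-battery is obtained (apcupsd, SNMP polling of the Network Management Card, etc.) are assumptions to adapt to your environment.

```shell
#!/bin/sh
# Sketch of a timed-shutdown policy for a UPS-backed two-node cluster.
# How you learn the time-on-battery (apcupsd, SNMP, NMC traps) is
# deliberately left out; this only encodes the decision logic.

GRACE_BEFORE_STOP=900     # ride out the first 15 minutes on battery
GRACE_FOR_SHUTDOWN=600    # then give the cluster 10 minutes to stop domains

decide_action() {
    # $1 = seconds the UPS has been on battery; prints the action to take
    if [ "$1" -lt "$GRACE_BEFORE_STOP" ]; then
        echo "wait"
    elif [ "$1" -lt $((GRACE_BEFORE_STOP + GRACE_FOR_SHUTDOWN)) ]; then
        echo "stop-resources"   # e.g. crm resource stop <VirtualDomain resources>
    else
        echo "poweroff"         # e.g. systemctl poweroff on each node
    fi
}
```

Run periodically from a timer, this gives the three phases described in the post; checking whether one or both nodes are still up would happen before the poweroff step.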
Re: [ClusterLabs] Removing a resource without stopping it
I think I have a working method, though not under the conditions first explained. I would love feedback on its sanity, though.

The ultimate goal is to migrate the resource (the VM) to a different pacemaker cluster. Setting it to unmanaged, migrating the VM off, setting the resource to disabled, and then managing the resource again marks it as stopped; then it can be deleted.

[root@an-a01n01 ~]# pcs resource unmanage srv01-cs8
# Migrate the server to another pacemaker cluster here
[root@an-a01n01 ~]# pcs resource disable srv01-cs8
Warning: 'srv01-cs8' is unmanaged
[root@an-a01n01 ~]# pcs resource manage srv01-cs8
[root@an-a01n01 ~]# pcs resource delete srv01-cs8
Deleting Resource - srv01-cs8

Going back to the original question, though, deleting the server from pacemaker while the VM is left running is still something I am quite curious about.

Madi

On 2022-01-29 13:27, Strahil Nikolov wrote:

I know... and the editor stuff can be bypassed, if the approach works. Best Regards, Strahil Nikolov

On Sat, Jan 29, 2022 at 15:43, Digimer wrote:

On 2022-01-29 03:16, Strahil Nikolov wrote:

I think there is pcs cluster edit --scope=resources (based on memory). Can you try to delete it from there? Best Regards, Strahil Nikolov

Thanks, but no, that doesn't seem to work. 'pcs cluster edit' wants to open an editor, and I'm trying to find a way to make this change with a program (once I sort out the manual process). So an option that requires user input won't work in my case regardless. Thank you just the same though!
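Since the poster wants to drive this from a program rather than an editor, the unmanage -> migrate -> disable -> manage -> delete sequence above lends itself to scripting. Below is a hedged sketch with a dry-run mode; the resource name is the example from the thread, and the live-migration step itself is a placeholder that depends entirely on the setup.

```shell
#!/bin/sh
# Dry-run sketch of the hand-off sequence described above. DRYRUN=1
# prints the commands instead of executing them, so the order can be
# reviewed before running it against a real cluster.
RES="srv01-cs8"
DRYRUN=1

run() {
    if [ "$DRYRUN" = 1 ]; then echo "would run: $*"; else "$@"; fi
}

handoff() {
    run pcs resource unmanage "$RES"
    # ... live-migrate the VM to the other pacemaker cluster here ...
    run pcs resource disable "$RES"   # warns that the resource is unmanaged
    run pcs resource manage "$RES"    # resource is now recorded as stopped
    run pcs resource delete "$RES"
}

handoff
```

Setting `DRYRUN=0` would execute the pcs commands for real; error checking between steps (e.g. confirming the migration completed before disabling) is left out of the sketch.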
Re: [ClusterLabs] Removing a resource without stopping it
On 2022-01-29 03:16, Strahil Nikolov wrote:

I think there is pcs cluster edit --scope=resources (based on memory). Can you try to delete it from there? Best Regards, Strahil Nikolov

Thanks, but no, that doesn't seem to work. 'pcs cluster edit' wants to open an editor, and I'm trying to find a way to make this change with a program (once I sort out the manual process). So an option that requires user input won't work in my case regardless. Thank you just the same though!
Re: [ClusterLabs] Removing a resource without stopping it
On 2022-01-29 00:10, Digimer wrote:

On 2022-01-28 16:54, Ken Gaillot wrote:

On Fri, 2022-01-28 at 16:38 -0500, Digimer wrote:

Hi all, I'm trying to figure out how to move a running VM from one pacemaker cluster to another. I've got the storage and VM live migration sorted, but am having trouble with pacemaker. I tried unmanaging the resource (the VM), then deleted the resource, and the node got fenced. So I am assuming it thought it couldn't stop the service, so it self-fenced. In any case, can someone let me know what the proper procedure is? Said more directly: how do I delete a resource from pacemaker (via pcs on EL8) without stopping the resource?

Set the stop-orphan-resources cluster property to false (at least while you move it). The problem with your first approach is that once you remove the resource configuration, which includes the is-managed setting, Pacemaker no longer knows the resource is unmanaged. And even if you set it via resource defaults or something, eventually you have to set it back, at which point Pacemaker will still have the same response.

Follow-up; I tried the following sequence:

pcs property set stop-orphan-resources=false
pcs resource unmanage srv01-cs8       # Without this, the resource was stopped
pcs resource delete srv01-cs8         # Failed with "Warning: 'srv01-cs8' is unmanaged"
pcs resource delete srv01-cs8 --force # Got 'Deleting Resource - srv01-cs8'
pcs resource status
--
* srv01-cs8 (ocf::alteeve:server): ORPHANED Started an-a01n01 (unmanaged)
--

So it seems this doesn't delete the resource. Can I get some insight on how to actually delete this resource without disabling the VM? Thanks!

Adding; I tried 'pcs property set stop-orphan-resources=true' and it stopped the VM and then actually deleted the resource. =/
Re: [ClusterLabs] Removing a resource without stopping it
On 2022-01-28 16:54, Ken Gaillot wrote:

On Fri, 2022-01-28 at 16:38 -0500, Digimer wrote:

Hi all, I'm trying to figure out how to move a running VM from one pacemaker cluster to another. I've got the storage and VM live migration sorted, but am having trouble with pacemaker. I tried unmanaging the resource (the VM), then deleted the resource, and the node got fenced. So I am assuming it thought it couldn't stop the service, so it self-fenced. In any case, can someone let me know what the proper procedure is? Said more directly: how do I delete a resource from pacemaker (via pcs on EL8) without stopping the resource?

Set the stop-orphan-resources cluster property to false (at least while you move it). The problem with your first approach is that once you remove the resource configuration, which includes the is-managed setting, Pacemaker no longer knows the resource is unmanaged. And even if you set it via resource defaults or something, eventually you have to set it back, at which point Pacemaker will still have the same response.

Follow-up; I tried the following sequence:

pcs property set stop-orphan-resources=false
pcs resource unmanage srv01-cs8       # Without this, the resource was stopped
pcs resource delete srv01-cs8         # Failed with "Warning: 'srv01-cs8' is unmanaged"
pcs resource delete srv01-cs8 --force # Got 'Deleting Resource - srv01-cs8'
pcs resource status
--
* srv01-cs8 (ocf::alteeve:server): ORPHANED Started an-a01n01 (unmanaged)
--

So it seems this doesn't delete the resource. Can I get some insight on how to actually delete this resource without disabling the VM? Thanks!

digimer
Re: [ClusterLabs] Removing a resource without stopping it
On 2022-01-28 16:54, Ken Gaillot wrote:

On Fri, 2022-01-28 at 16:38 -0500, Digimer wrote:

Hi all, I'm trying to figure out how to move a running VM from one pacemaker cluster to another. I've got the storage and VM live migration sorted, but am having trouble with pacemaker. I tried unmanaging the resource (the VM), then deleted the resource, and the node got fenced. So I am assuming it thought it couldn't stop the service, so it self-fenced. In any case, can someone let me know what the proper procedure is? Said more directly: how do I delete a resource from pacemaker (via pcs on EL8) without stopping the resource?

Set the stop-orphan-resources cluster property to false (at least while you move it). The problem with your first approach is that once you remove the resource configuration, which includes the is-managed setting, Pacemaker no longer knows the resource is unmanaged. And even if you set it via resource defaults or something, eventually you have to set it back, at which point Pacemaker will still have the same response.

Thanks for this! I'm not entirely sure I understand the implications of "stop-orphan-resources". I assume it would be a bad idea to set it to "false" and leave it that way, given that's not the default. What's the purpose of it being set to 'true'? Thanks!
[ClusterLabs] Removing a resource without stopping it
Hi all,

I'm trying to figure out how to move a running VM from one pacemaker cluster to another. I've got the storage and VM live migration sorted, but am having trouble with pacemaker. I tried unmanaging the resource (the VM), then deleted the resource, and the node got fenced. So I am assuming it thought it couldn't stop the service, so it self-fenced. In any case, can someone let me know what the proper procedure is? Said more directly: how do I delete a resource from pacemaker (via pcs on EL8) without stopping the resource?
Re: [ClusterLabs] Is there a DRBD forum?
On 2021-10-19 04:27, Ian Diddams via Users wrote:

Rather than clog up what I perceive as a pacemaker/corosync forum, is there a DRBD forum I could send a query to? (FWIW, I'm trying to find a way to log drbd specifically to a separate log, other than the system log, via its kernel logging?)

DRBD, along with any and all open-source projects related to high availability, is welcome and on-topic. So no worries there. Separately, there is also a dedicated DRBD users' list at drbd-u...@lists.linbit.com, and they maintain a Slack on "linbit-community".
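For the logging question itself: one common approach (an assumption on my part, not something stated in this thread) is an rsyslog filter that matches DRBD's kernel messages and routes them to their own file, along these lines:

```
# /etc/rsyslog.d/drbd.conf  (the path and file name are examples)
:msg, contains, "drbd" /var/log/drbd.log
& stop
```

The `& stop` keeps the matched lines out of the general syslog; drop it if DRBD messages should appear in both places. Restart rsyslog after adding the file.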
Re: [ClusterLabs] Pacemaker 2.1.1 final release now available
On 2021-09-09 7:12 p.m., Ken Gaillot wrote:

Hi all,

Pacemaker 2.1.1 has officially been released, with source code available at:

https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.1.1

Highlights include a number of regression fixes and other bug fixes. For more details, see the ChangeLog in the source repository. Many thanks to all contributors of source code to this release, including Chris Lumens, Christine Caulfield, Emil Penchev, Gao,Yan, Grace Chin, Hideo Yamauchi, José Guilherme Vanz, Ken Gaillot, Klaus Wenninger, and Oyvind Albrigtsen.

Congrats to all!!
Re: [ClusterLabs] If You Were Building a DRBD-Based Cluster Today...
On 2021-08-18 3:43 p.m., Eric Robinson wrote:

If you were building a DRBD-based cluster today on servers with internal storage, what would you use for a filesystem? To be more specific, the servers have 6 x 3.2 TB NVMe drives and no RAID controller. Would you build an mdraid array as your DRBD backing device? Maybe a ZFS RAIDZ? How would you feel about using ZFS as the filesystem over DRBD to take advantage of filesystem compression? Or would you do something else entirely?

What will the use-case be?
Re: [ClusterLabs] Pacemaker/corosync behavior in case of partial split brain
On 2021-08-05 2:25 p.m., Andrei Borzenkov wrote:

Three nodes A, B, C. Communication between A and B is blocked (completely - no packet can pass in either direction). A and B can communicate with C. I expected that the result would be two partitions - (A, C) and (B, C). To my surprise, A went offline, leaving (B, C) running. It was always the same node (the one with node id 1, if it matters, out of 1, 2, 3). How is the surviving partition determined in this case? Can I be sure the same will also work in the case of multiple nodes? I.e., if I have two sites with an equal number of nodes and a third site as witness, and connectivity between the multi-node sites is lost but each site can communicate with the witness, will one site go offline? Which one?

In your case, your nodes were otherwise healthy, so quorum worked. To properly avoid a split brain (when a node is not behaving properly, i.e. lockups, bad RAM/CPU, etc.) you really need actual fencing. In such a case, whichever nodes maintain quorum will fence the lost node (be it because it became inquorate or stopped behaving properly). As for the mechanics of how quorum was determined in your case above, I'll let one of the corosync people answer.
Re: [ClusterLabs] 'pcs stonith update' takes, then reverts
On 2021-07-26 12:50 p.m., kgail...@redhat.com wrote:

On Mon, 2021-07-26 at 12:25 -0400, Digimer wrote:

On 2021-07-26 9:54 a.m., kgail...@redhat.com wrote:

On Fri, 2021-07-23 at 21:46 -0400, Digimer wrote:

After a LOT of hassle, I finally got it updated, but OMG it was painful. I degraded the cluster (unsure if needed), set maintenance mode, deleted the stonith levels, deleted the stonith devices, recreated them with the updated values, recreated the stonith levels, and finally disabled maintenance mode. It should not have been this hard, right? Why in heck would it be that pacemaker kept "rolling back" to old configs? I'd delete the stonith

That is bizarre. It sounds like the CIB changes were taking effect locally, then being rejected by the rest of the cluster, which would send the "correct" CIB back to the originator. The logs of interest would be pacemaker.log from both nodes at the time you made the first configuration change that failed. I'm guessing the logs you posted were from after that point?

Below are the logs. The change appears to be first tried at 'Jul 23 16:22:27', made on an-a02n01; logs from a few minutes before are included in case they're relevant.

* an-a02n01: https://www.alteeve.com/an-repo/files/an-a02n01.pacemaker.log
* an-a02n02: https://www.alteeve.com/an-repo/files/an-a02n02.pacemaker.log

Note that the PDUs as originally configured (10.201.2.1/2) were not available, so I had to disable and clean up the stonith resources. They seemed to keep getting re-enabled, so I got into the habit of a disable -> cleanup -> disable -> cleanup cycle before I could reliably get the resources to be 'stopped (disabled)' in 'pcs stonith status'.
digimer

The initial change happened here:

Jul 23 16:22:27 an-a02n01.alteeve.com pacemaker-based [121628] (cib_perform_op) info: Diff: --- 0.337.112 2
Jul 23 16:22:27 an-a02n01.alteeve.com pacemaker-based [121628] (cib_perform_op) info: Diff: +++ 0.338.0 6a24af66df3d9f825cc2681222f8f5d6
Jul 23 16:22:27 an-a02n01.alteeve.com pacemaker-based [121628] (cib_perform_op) info: + /cib: @epoch=338, @num_updates=0
Jul 23 16:22:27 an-a02n01.alteeve.com pacemaker-based [121628] (cib_perform_op) info: + /cib/configuration/resources/primitive[@id='apc_snmp_node1_an-pdu03']/instance_attributes[@id='apc_snmp_node1_an-pdu03-instance_attributes']/nvpair[@id='apc_snmp_node1_an-pdu03-instance_attributes-ip']: @value=10.201.2.3
Jul 23 16:22:27 an-a02n01.alteeve.com pacemaker-based [121628] (cib_replace_notify) info: Replaced: 0.337.112 -> 0.338.0 from an-a02n02
Jul 23 16:22:27 an-a02n01.alteeve.com pacemaker-based [121628] (cib_process_request) info: Completed cib_replace operation for section configuration: OK (rc=0, origin=an-a02n02/cibadmin/2, version=0.338.0)

origin=an-a02n02/cibadmin/2 means that someone or something ran the cibadmin tool on an-a02n02. Presumably this was your interactive pcs command.
It was then reverted by:

Jul 23 16:22:50 an-a02n01.alteeve.com pacemaker-based [121628] (cib_perform_op) info: Diff: --- 0.343.3 2
Jul 23 16:22:50 an-a02n01.alteeve.com pacemaker-based [121628] (cib_perform_op) info: Diff: +++ 0.344.0 (null)
Jul 23 16:22:50 an-a02n01.alteeve.com pacemaker-based [121628] (cib_perform_op) info: + /cib: @epoch=344, @num_updates=0
Jul 23 16:22:50 an-a02n01.alteeve.com pacemaker-based [121628] (cib_perform_op) info: ++ /cib/configuration/resources:
Jul 23 16:22:50 an-a02n01.alteeve.com pacemaker-based [121628] (cib_perform_op) info: ++ [XML of the re-added resources, stripped by the mail archive]
Jul 23 16:22:50 an-a02n01.alteev
Re: [ClusterLabs] 'pcs stonith update' takes, then reverts
On 2021-07-26 9:54 a.m., kgail...@redhat.com wrote:

On Fri, 2021-07-23 at 21:46 -0400, Digimer wrote:

After a LOT of hassle, I finally got it updated, but OMG it was painful. I degraded the cluster (unsure if needed), set maintenance mode, deleted the stonith levels, deleted the stonith devices, recreated them with the updated values, recreated the stonith levels, and finally disabled maintenance mode. It should not have been this hard, right? Why in heck would it be that pacemaker kept "rolling back" to old configs? I'd delete the stonith

That is bizarre. It sounds like the CIB changes were taking effect locally, then being rejected by the rest of the cluster, which would send the "correct" CIB back to the originator. The logs of interest would be pacemaker.log from both nodes at the time you made the first configuration change that failed. I'm guessing the logs you posted were from after that point?

Below are the logs. The change appears to be first tried at 'Jul 23 16:22:27', made on an-a02n01; logs from a few minutes before are included in case they're relevant.

* an-a02n01: https://www.alteeve.com/an-repo/files/an-a02n01.pacemaker.log
* an-a02n02: https://www.alteeve.com/an-repo/files/an-a02n02.pacemaker.log

Note that the PDUs as originally configured (10.201.2.1/2) were not available, so I had to disable and clean up the stonith resources. They seemed to keep getting re-enabled, so I got into the habit of a disable -> cleanup -> disable -> cleanup cycle before I could reliably get the resources to be 'stopped (disabled)' in 'pcs stonith status'.

digimer
Re: [ClusterLabs] 'pcs stonith update' takes, then reverts
On 2021-07-26 9:54 a.m., kgail...@redhat.com wrote:

On Fri, 2021-07-23 at 21:46 -0400, Digimer wrote:

After a LOT of hassle, I finally got it updated, but OMG it was painful. I degraded the cluster (unsure if needed), set maintenance mode, deleted the stonith levels, deleted the stonith devices, recreated them with the updated values, recreated the stonith levels, and finally disabled maintenance mode. It should not have been this hard, right? Why in heck would it be that pacemaker kept "rolling back" to old configs? I'd delete the stonith

That is bizarre. It sounds like the CIB changes were taking effect locally, then being rejected by the rest of the cluster, which would send the "correct" CIB back to the originator. The logs of interest would be pacemaker.log from both nodes at the time you made the first configuration change that failed. I'm guessing the logs you posted were from after that point?

The logs I shared started after the issue began, yes. I can see if I can access the nodes and pull the logs. Note that I degraded the cluster (withdrew the inactive node), so the remaining node was a cluster by itself, and it still happened.

digimer
Re: [ClusterLabs] 'pcs stonith update' takes, then reverts
After a LOT of hassle, I finally got it updated, but OMG it was painful. I degraded the cluster (unsure if needed), set maintenance mode, deleted the stonith levels, deleted the stonith devices, recreated them with the updated values, recreated the stonith levels, and finally disabled maintenance mode.

It should not have been this hard, right? Why in heck would it be that pacemaker kept "rolling back" to old configs? I'd delete the stonith levels, delete one PDU stonith resource, delete a second, and suddenly all the levels were back and the resources I had just removed came back with the old configs. Certainly I was doing something wrong, but what?

digimer

On 2021-07-23 8:04 p.m., Digimer wrote:
> Update;
>
> Appears I can't even delete the damn things. They re-appeared after
> doing a 'pcs stonith remove <device>'!
>
> Wow.
>
> digimer
>
> On 2021-07-23 7:56 p.m., Digimer wrote:
>> Hi all,
>>
>> Got a really odd one here...
>>
>> I had a cluster in the lab where it was built and tested. Then we
>> deployed it, and I've been trying to update the stonith config. It
>> _seems_ to take at first, then it reverts back to the old config. This
>> is super annoying, obviously. :)
>>
>> So I can confirm that the APC PDUs work;
>>
>> [root@an-a02n01 ~]# fence_apc_snmp -a 10.201.2.3 -n 3 -o status
>> Status: ON
>> [root@an-a02n01 ~]# fence_apc_snmp -a 10.201.2.4 -n 3 -o status
>> Status: ON
>>
>> Here's the config as-shipped;
>>
>> # pcs stonith config apc_snmp_node1_an-pdu03
>> Resource: apc_snmp_node1_an-pdu03 (class=stonith type=fence_apc_snmp)
>> Attributes: ip=10.201.2.1 pcmk_host_list=an-a02n01 pcmk_off_action=reboot port=5
>> Operations: monitor interval=60 (apc_snmp_node1_an-pdu03-monitor-interval-60)
>> Target: an-a02n01
>> Level 1 - ipmilan_node1
>> Level 2 - apc_snmp_node1_an-pdu03,apc_snmp_node1_an-pdu04
>> Level 3 - delay_node1
>> Target: an-a02n02
>> Level 1 - ipmilan_node2
>> Level 2 - apc_snmp_node2_an-pdu03,apc_snmp_node2_an-pdu04
>> Level 3 - delay_node2
>>
>> So in this example, I am trying to update 'apc_snmp_node1_an-pdu03' to
>> change the IP from 10.201.2.1 -> 10.201.2.3 and to change the port from
>> port 5 -> 3.
>>
>> # pcs stonith update apc_snmp_node1_an-pdu03 ip=10.201.2.3 pcmk_host_list=an-a02n01 pcmk_off_action=reboot port=3
>> # pcs stonith config apc_snmp_node1_an-pdu03
>> Resource: apc_snmp_node1_an-pdu03 (class=stonith type=fence_apc_snmp)
>> Attributes: ip=10.201.2.3 pcmk_host_list=an-a02n01 pcmk_off_action=reboot port=3
>>
>> As you can see, initially it appears to work. However, after a minute,
>> the config reverts;
>>
>> # pcs stonith config apc_snmp_node1_an-pdu03
>> Resource: apc_snmp_node1_an-pdu03 (class=stonith type=fence_apc_snmp)
>> Attributes: ip=10.201.2.1 pcmk_host_list=an-a02n01 pcmk_off_action=reboot port=5
>>
>> This happens to all four stonith devices (two per node, two nodes). I've
>> tried doing a 'pcs stonith disable <device>' for all four devices, and did a
>> 'pcs stonith cleanup' to make sure errors were cleared before updating,
>> still no luck.
>>
>> Any idea what I'm doing wrong?
>>
>> The logs from the node I run the update on, followed by the logs on the peer:
>>
>> digimer
>>
>> Jul 23 16:44:43 an-a02n01.alteeve.com pacemaker-attrd[121631]: notice: Updating all attributes after cib_refresh_notify event
>> Jul 23 16:44:43 an-a02n01.alteeve.com pacemaker-controld[121633]: notice: State transition S_IDLE -> S_POLICY_ENGINE
>> Jul 23 16:44:43 an-a02n01.alteeve.com pacemaker-controld[121633]: notice: State transition S_ELECTION -> S_INTEGRATION
>> Jul 23 16:44:43 an-a02n01.alteeve.com pacemaker-fenced[121629]: notice: Added 'apc_snmp_node1_an-pdu03' to device list (5 active devices)
>> Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: notice: Clearing failure of apc_snmp_node1_an-pdu03 on an-a02n01 because resource parameters have changed
>> Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: warning: Unexpected result (error) was recorded for start of apc_snmp_node1_an-pdu03 on an-a02n01 at Jul 23 16:37:55 2021
>> Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: warning: Unexpected result (error) was recorded for start of apc_snmp_node2_an-pdu03 on an-a02n01 at Jul 23 16:35:04 2021
>> Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: warning: Unexpected result (error) was recorded for start of apc_snmp_node2_an-pdu04 on an-a02n01 at Jul 23 16:35:04 2021
>> Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: warning: Unexpected result (error) was
Re: [ClusterLabs] 'pcs stonith update' takes, then reverts
Update;

Appears I can't even delete the damn things. They re-appeared after doing a 'pcs stonith remove <device>'!

Wow.

digimer

On 2021-07-23 7:56 p.m., Digimer wrote:
> Hi all,
>
> Got a really odd one here...
>
> I had a cluster in the lab where it was built and tested. Then we
> deployed it, and I've been trying to update the stonith config. It
> _seems_ to take at first, then it reverts back to the old config. This
> is super annoying, obviously. :)
>
> So I can confirm that the APC PDUs work;
>
> [root@an-a02n01 ~]# fence_apc_snmp -a 10.201.2.3 -n 3 -o status
> Status: ON
> [root@an-a02n01 ~]# fence_apc_snmp -a 10.201.2.4 -n 3 -o status
> Status: ON
>
> Here's the config as-shipped;
>
> # pcs stonith config apc_snmp_node1_an-pdu03
> Resource: apc_snmp_node1_an-pdu03 (class=stonith type=fence_apc_snmp)
> Attributes: ip=10.201.2.1 pcmk_host_list=an-a02n01 pcmk_off_action=reboot port=5
> Operations: monitor interval=60 (apc_snmp_node1_an-pdu03-monitor-interval-60)
> Target: an-a02n01
> Level 1 - ipmilan_node1
> Level 2 - apc_snmp_node1_an-pdu03,apc_snmp_node1_an-pdu04
> Level 3 - delay_node1
> Target: an-a02n02
> Level 1 - ipmilan_node2
> Level 2 - apc_snmp_node2_an-pdu03,apc_snmp_node2_an-pdu04
> Level 3 - delay_node2
>
> So in this example, I am trying to update 'apc_snmp_node1_an-pdu03' to
> change the IP from 10.201.2.1 -> 10.201.2.3 and to change the port from
> port 5 -> 3.
>
> # pcs stonith update apc_snmp_node1_an-pdu03 ip=10.201.2.3 pcmk_host_list=an-a02n01 pcmk_off_action=reboot port=3
> # pcs stonith config apc_snmp_node1_an-pdu03
> Resource: apc_snmp_node1_an-pdu03 (class=stonith type=fence_apc_snmp)
> Attributes: ip=10.201.2.3 pcmk_host_list=an-a02n01 pcmk_off_action=reboot port=3
>
> As you can see, initially it appears to work. However, after a minute,
> the config reverts;
>
> # pcs stonith config apc_snmp_node1_an-pdu03
> Resource: apc_snmp_node1_an-pdu03 (class=stonith type=fence_apc_snmp)
> Attributes: ip=10.201.2.1 pcmk_host_list=an-a02n01 pcmk_off_action=reboot port=5
>
> This happens to all four stonith devices (two per node, two nodes). I've
> tried doing a 'pcs stonith disable <device>' for all four devices, and did a
> 'pcs stonith cleanup' to make sure errors were cleared before updating,
> still no luck.
>
> Any idea what I'm doing wrong?
>
> The logs from the node I run the update on, followed by the logs on the peer:
>
> digimer
>
> Jul 23 16:44:43 an-a02n01.alteeve.com pacemaker-attrd[121631]: notice: Updating all attributes after cib_refresh_notify event
> Jul 23 16:44:43 an-a02n01.alteeve.com pacemaker-controld[121633]: notice: State transition S_IDLE -> S_POLICY_ENGINE
> Jul 23 16:44:43 an-a02n01.alteeve.com pacemaker-controld[121633]: notice: State transition S_ELECTION -> S_INTEGRATION
> Jul 23 16:44:43 an-a02n01.alteeve.com pacemaker-fenced[121629]: notice: Added 'apc_snmp_node1_an-pdu03' to device list (5 active devices)
> Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: notice: Clearing failure of apc_snmp_node1_an-pdu03 on an-a02n01 because resource parameters have changed
> Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: warning: Unexpected result (error) was recorded for start of apc_snmp_node1_an-pdu03 on an-a02n01 at Jul 23 16:37:55 2021
> Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: warning: Unexpected result (error) was recorded for start of apc_snmp_node2_an-pdu03 on an-a02n01 at Jul 23 16:35:04 2021
> Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: warning: Unexpected result (error) was recorded for start of apc_snmp_node2_an-pdu04 on an-a02n01 at Jul 23 16:35:04 2021
> Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: warning: Unexpected result (error) was recorded for start of apc_snmp_node1_an-pdu04 on an-a02n02 at Jul 23 16:34:50 2021
> Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: notice: Clearing failure of apc_snmp_node1_an-pdu03 on an-a02n02 because resource parameters have changed
> Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: warning: Unexpected result (error) was recorded for start of apc_snmp_node1_an-pdu03 on an-a02n02 at Jul 23 16:37:42 2021
> Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: warning: Unexpected result (error) was recorded for start of apc_snmp_node2_an-pdu03 on an-a02n02 at Jul 23 16:34:50 2021
> Jul 23 16:44:47 a
[ClusterLabs] 'pcs stonith update' takes, then reverts
Hi all, Got a really odd one here... I had a cluster in the lab where it was built and tested. Then we deployed it, and I've been trying to update the stonith config. It _seems_ to take at first, then it reverts back to the old config. This is super annoying, obviously. :) So I can confirm that the APC PDUs work; [root@an-a02n01 ~]# fence_apc_snmp -a 10.201.2.3 -n 3 -o status Status: ON [root@an-a02n01 ~]# fence_apc_snmp -a 10.201.2.4 -n 3 -o status Status: ON Here's the config as-shipped; # pcs stonith config apc_snmp_node1_an-pdu03 Resource: apc_snmp_node1_an-pdu03 (class=stonith type=fence_apc_snmp) Attributes: ip=10.201.2.1 pcmk_host_list=an-a02n01 pcmk_off_action=reboot port=5 Operations: monitor interval=60 (apc_snmp_node1_an-pdu03-monitor-interval-60) Target: an-a02n01 Level 1 - ipmilan_node1 Level 2 - apc_snmp_node1_an-pdu03,apc_snmp_node1_an-pdu04 Level 3 - delay_node1 Target: an-a02n02 Level 1 - ipmilan_node2 Level 2 - apc_snmp_node2_an-pdu03,apc_snmp_node2_an-pdu04 Level 3 - delay_node2 So in this example, I am trying to update 'apc_snmp_node1_an-pdu03' to change the IP from 10.201.2.1 -> 10.201.2.3 and to change the port from port 5 -> 3. # pcs stonith update apc_snmp_node1_an-pdu03 ip=10.201.2.3 pcmk_host_list=an-a02n01 pcmk_off_action=reboot port=3 # pcs stonith config apc_snmp_node1_an-pdu03 Resource: apc_snmp_node1_an-pdu03 (class=stonith type=fence_apc_snmp) Attributes: ip=10.201.2.3 pcmk_host_list=an-a02n01 pcmk_off_action=reboot port=3 As you can see, initially it appears to work. However, after a minute, the config reverts; # pcs stonith config apc_snmp_node1_an-pdu03 Resource: apc_snmp_node1_an-pdu03 (class=stonith type=fence_apc_snmp) Attributes: ip=10.201.2.1 pcmk_host_list=an-a02n01 pcmk_off_action=reboot port=5 This happens to all four stonith devices (two per node, two nodes). I've tried doing a 'pcs stonith disable ' for all four devices, and did a 'pcs stonith cleanup' to make sure errors were cleared before updating, still no luck. 
Any idea what I'm doing wrong? The logs from the node I run the update on, followed by the logs on the peer: digimer Jul 23 16:44:43 an-a02n01.alteeve.com pacemaker-attrd[121631]: notice: Updating all attributes after cib_refresh_notify event Jul 23 16:44:43 an-a02n01.alteeve.com pacemaker-controld[121633]: notice: State transition S_IDLE -> S_POLICY_ENGINE Jul 23 16:44:43 an-a02n01.alteeve.com pacemaker-controld[121633]: notice: State transition S_ELECTION -> S_INTEGRATION Jul 23 16:44:43 an-a02n01.alteeve.com pacemaker-fenced[121629]: notice: Added 'apc_snmp_node1_an-pdu03' to device list (5 active devices) Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: notice: Clearing failure of apc_snmp_node1_an-pdu03 on an-a02n01 because resource parameters have changed Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: warning: Unexpected result (error) was recorded for start of apc_snmp_node1_an-pdu03 on an-a02n01 at Jul 23 16:37:55 2021 Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: warning: Unexpected result (error) was recorded for start of apc_snmp_node2_an-pdu03 on an-a02n01 at Jul 23 16:35:04 2021 Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: warning: Unexpected result (error) was recorded for start of apc_snmp_node2_an-pdu04 on an-a02n01 at Jul 23 16:35:04 2021 Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: warning: Unexpected result (error) was recorded for start of apc_snmp_node1_an-pdu04 on an-a02n02 at Jul 23 16:34:50 2021 Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: notice: Clearing failure of apc_snmp_node1_an-pdu03 on an-a02n02 because resource parameters have changed Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: warning: Unexpected result (error) was recorded for start of apc_snmp_node1_an-pdu03 on an-a02n02 at Jul 23 16:37:42 2021 Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: warning: 
Unexpected result (error) was recorded for start of apc_snmp_node2_an-pdu03 on an-a02n02 at Jul 23 16:34:50 2021 Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: warning: Unexpected result (error) was recorded for start of apc_snmp_node2_an-pdu04 on an-a02n02 at Jul 23 16:34:50 2021 Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: warning: Forcing apc_snmp_node2_an-pdu03 away from an-a02n01 after 100 failures (max=100) Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: warning: Forcing apc_snmp_node2_an-pdu04 away from an-a02n01 after 100 failures (max=100) Jul 23 16:44:47 an-a02n01.alteeve.com pacemaker-schedulerd[121632]: warning: Forcing apc_snmp_node1_an-pdu04 away from an-a02n02 after 100 failures (max=100) Jul 23 16:44:
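For anyone hitting a similar "update takes, then reverts" symptom: if a change disappears after about a minute, something else is writing to the CIB. One way to catch it in the act (a sketch using only standard pcs/journalctl commands; the file paths are arbitrary) is to snapshot the CIB before and after the revert window and diff:

```shell
# Save the CIB right after making the change...
pcs cluster cib > /tmp/cib-before.xml

# ...wait past the ~1 minute window in which the revert was observed...
sleep 90

# ...then save it again and compare. The diff shows exactly which
# attributes were rewritten, narrowing down what is doing the rewriting.
pcs cluster cib > /tmp/cib-after.xml
diff -u /tmp/cib-before.xml /tmp/cib-after.xml

# Cross-reference the revert time against the logs to identify the writer:
journalctl --since "10 minutes ago" | grep -i cib
```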
Re: [ClusterLabs] pcs stonith update problems
On 2021-07-21 8:19 a.m., Tomas Jelinek wrote: > Dne 16. 07. 21 v 16:30 Digimer napsal(a): >> On 2021-07-16 9:26 a.m., Tomas Jelinek wrote: >>> Dne 16. 07. 21 v 6:35 Andrei Borzenkov napsal(a): >>>> On 16.07.2021 01:02, Digimer wrote: >>>>> Hi all, >>>>> >>>>> I've got a predicament... I want to update a stonith resource to >>>>> remove an argument. Specifically, when resource move nodes, I want to >>>>> change the stonith delay to favour the new host. This involves adding >>>>> the 'delay="x"' argument to one stonith resource, and removing it from >>>>> the other; >>>>> >>>>> Example; >>>>> >>>>> >>>>> # pcs cluster cib | grep -B7 -A7 '"delay"' >>>>> >>>> type="fence_ipmilan"> >>>>> >>>>> >>>> name="ipaddr" value="10.201.17.1"/> >>>>> >>>> name="password" value="xxx"/> >>>>> >>>> id="ipmilan_node1-instance_attributes-pcmk_host_list" >>>>> name="pcmk_host_list" value="an-a02n01"/> >>>>> >>>> name="username" value="admin"/> >>>>> >>>> name="delay" value="15"/> >>>>> >>>>> >>>>> >>>> name="monitor"/> >>>>> >>>>> >>>>> >>>>> >>>>> Here, the stonith resource 'ipmilan_node1' has the delay="15". >>>>> >>>>> If I run: >>>>> >>>>> >>>>> # pcs stonith update ipmilan_node1 fence_ipmilan ipaddr="10.201.17.1" >>>>> password="xxx" username="admin"; echo $? >>>>> 0 >>>>> >>>>> >>>>> I see nothing happen in journald, and the delay argument remains in >>>>> the >>>>> 'pcs cluster cib' output. If, however, I do; >>>>> >>>>> >>>>> # /usr/sbin/pcs stonith update ipmilan_node1 fence_ipmilan >>>>> ipaddr="10.201.17.1" password="xxx" username="admin" delay="0"; >>>>> echo $? >>>>> 0 >>>>> >>>>> >>>>> I can see in journald that the CIB was updated and can confirm in 'pcs >>>>> cluster cib' that the 'delay' value becomes '0'. So it seems that, >>>>> if an >>>>> argument previously existed and is NOT specified in an update, it >>>>> is not >>>>> removed. >>>>> >>>>> Is this intentional for some reason? If so, how would I remove the >>>>> delay >>>>> attribute? 
>>> >>> Yes, this is intentional. As far as I remember, update commands in pcs >>> have always worked this way: >>> * do not change attributes not specified in the command >>> * if an attribute is specified with an empty value, remove the attribute >>> from cluster configuration >>> * else set specified value of the specified attribute in cluster >>> configuration >>> >>> This means you only need to specify attributes you want to change. You >>> don't need to bother with attributes you want to keep unchanged. >>> >>> If you want to delete the delay attribute, you can do it like this: >>> pcs stonith update ipmilan_node1 delay= >>> This will remove delay and keep all the other attributes unchanged. >>> >>> I'm not sure why this principle is not documented in pcs man page. We >>> can fix that, though. >>> >>> Note that specifying a stonith agent in the update command does nothing >>> (which is expected) and is silently ignored by pcs (which is a bug). >>> >>> Regards, >>> Tomas >> >> Ah, thank you (and as Andrei)! >> >> Is this behaviour not documented in 'pcs stonith help' -> update? Or was >> I blind and missed it? >> > > I think this isn't documented anywhere in pcs. It may be described in > Clusters from Scratch or a similar document. I think I've read it > somewhere, but I'm not sure where it was. > > Anyway, I put it on our todo list to get it explained in pcs help and > man page. > > > Thanks, > Tomas Thanks! -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Two node cluster without fencing and no split brain?
On 2021-07-21 3:26 a.m., Jehan-Guillaume de Rorthais wrote: > Hi, > > On Wed, 21 Jul 2021 04:28:30 + (UTC) > Strahil Nikolov via Users wrote: > >> Hi, >> consider using a 3rd system as a Q disk. Also, you can use iscsi from that >> node as a SBD device, so you will have proper fencing .If you don't have a >> hardware watchdog device, you can use softdog kernel module for that. Best > > Having 3 nodes for quorum AND watchdog (using softdog in last resort) is > enough, > isn't it? > But yes, having a shared storage to add a SBD device is even better. > > Regards, The third node with storage-based death is a way of creating a fence configuration. It works because it's fencing, not because it's quorum. -- Digimer
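For anyone wanting to try the iSCSI-backed SBD setup Strahil describes, the rough shape in pcs looks like the sketch below. This is not a tested recipe: the device path is a placeholder for the LUN exported by the third node, and exact subcommand syntax varies by pcs version.

```shell
# Placeholder path; substitute the iSCSI LUN exported by the third node.
SBD_DEV=/dev/disk/by-id/scsi-example-lun

# No hardware watchdog? Load softdog as a last-resort software watchdog.
modprobe softdog

# Initialize the SBD header on the shared disk (wipes existing SBD data on it),
# then enable SBD cluster-wide; a full cluster restart is needed afterward.
pcs stonith sbd device setup device="${SBD_DEV}"
pcs stonith sbd enable device="${SBD_DEV}" watchdog=/dev/watchdog
```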
Re: [ClusterLabs] Two node cluster without fencing and no split brain?
On 2021-07-20 6:04 p.m., john tillman wrote: > Greetings, > > Is it possible to configure a two node cluster (pacemaker 2.0) without > fencing and avoid split brain? No. > I was hoping there was a way to use a 3rd node's ip address, like from a > network switch, as a tie breaker to provide quorum. A simple successful > ping would do it. Quorum is a different concept and doesn't remove the need for fencing. > I realize that this 'ping' approach is not the bullet proof solution that > fencing would provide. However, it may be an improvement over two nodes > alone. It would be, at best, a false sense of security. > Is there a configuration like that already? Any other ideas? > > Pointers to useful documents/discussions on avoiding split brain with two > node clusters would be welcome. https://www.alteeve.com/w/The_2-Node_Myth (note: currently throwing a cert error related to the let's encrypt issue, should be cleared up soon). -- Digimer
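A sturdier variant of the "third IP as tie breaker" idea is corosync-qdevice, which gives a third machine a real quorum vote rather than a ping. As argued above, this still does not replace fencing; it only tells the surviving side it may proceed. A hedged sketch (the hostname is a placeholder):

```shell
# On the third machine, set up the qnetd daemon that casts the extra vote:
pcs qdevice setup model net --enable --start

# On one cluster node, register it; the ffsplit algorithm is designed for
# even-node clusters, granting exactly one side quorum on a 50/50 split:
pcs quorum device add model net host=qnetd.example.com algorithm=ffsplit
```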
Re: [ClusterLabs] pcs stonith update problems
On 2021-07-16 10:48 a.m., kgail...@redhat.com wrote: > On Thu, 2021-07-15 at 18:02 -0400, Digimer wrote: >> Hi all, >> >> I've got a predicament... I want to update a stonith resource to >> remove an argument. Specifically, when resources move nodes, I want to >> change the stonith delay to favour the new host. This involves adding >> the 'delay="x"' argument to one stonith resource, and removing it >> from >> the other; > > There are two better ways to do what you want that don't involve > changing the configuration every time. > > (1) Priority fencing delay: > > https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/singlehtml/index.html#cluster-options > > This was added in 2.0.4, and does exactly what you want. Just set the > priority meta-attribute on your resources, and the priority-fencing- > delay cluster property. > > (2) Attribute-based rules: > > https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/singlehtml/index.html#document-rules > > If you don't have 2.0.4, you can use rules instead. pcs might not > support it, though, so you might need to configure the XML. > > * Configure an ocf:pacemaker:attribute resource that's colocated with > your main resource (this just sets a node attribute wherever it's > active) > > * Configure the stonith delay in a rule-based attribute block, with a 0 > delay if the node attribute is "active" and a higher delay if it's > "inactive" Welp, this could have saved me a LOT of time, had I known. Haha! Thanks! -- Digimer
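In pcs terms, the priority-fencing-delay approach Ken describes looks roughly like this (the resource name is a placeholder; requires Pacemaker 2.0.4 or later):

```shell
# Give the important resource a priority, so the node currently hosting it
# "wins" a fence race...
pcs resource meta my_main_resource priority=10

# ...by making the node with the lower total resource priority wait this
# long before fencing its peer:
pcs property set priority-fencing-delay=15s
```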
Re: [ClusterLabs] pcs stonith update problems
On 2021-07-16 9:26 a.m., Tomas Jelinek wrote: > Dne 16. 07. 21 v 6:35 Andrei Borzenkov napsal(a): >> On 16.07.2021 01:02, Digimer wrote: >>> Hi all, >>> >>> I've got a predicament... I want to update a stonith resource to >>> remove an argument. Specifically, when resource move nodes, I want to >>> change the stonith delay to favour the new host. This involves adding >>> the 'delay="x"' argument to one stonith resource, and removing it from >>> the other; >>> >>> Example; >>> >>> >>> # pcs cluster cib | grep -B7 -A7 '"delay"' >>> >> type="fence_ipmilan"> >>> >>> >> name="ipaddr" value="10.201.17.1"/> >>> >> name="password" value="xxx"/> >>> >> name="pcmk_host_list" value="an-a02n01"/> >>> >> name="username" value="admin"/> >>> >> name="delay" value="15"/> >>> >>> >>> >> name="monitor"/> >>> >>> >>> >>> >>> Here, the stonith resource 'ipmilan_node1' has the delay="15". >>> >>> If I run: >>> >>> >>> # pcs stonith update ipmilan_node1 fence_ipmilan ipaddr="10.201.17.1" >>> password="xxx" username="admin"; echo $? >>> 0 >>> >>> >>> I see nothing happen in journald, and the delay argument remains in the >>> 'pcs cluster cib' output. If, however, I do; >>> >>> >>> # /usr/sbin/pcs stonith update ipmilan_node1 fence_ipmilan >>> ipaddr="10.201.17.1" password="xxx" username="admin" delay="0"; echo $? >>> 0 >>> >>> >>> I can see in journald that the CIB was updated and can confirm in 'pcs >>> cluster cib' that the 'delay' value becomes '0'. So it seems that, if an >>> argument previously existed and is NOT specified in an update, it is not >>> removed. >>> >>> Is this intentional for some reason? If so, how would I remove the delay >>> attribute? > > Yes, this is intentional. 
As far as I remember, update commands in pcs > have always worked this way: > * do not change attributes not specified in the command > * if an attribute is specified with an empty value, remove the attribute > from cluster configuration > * else set specified value of the specified attribute in cluster > configuration > > This means you only need to specify attributes you want to change. You > don't need to bother with attributes you want to keep unchanged. > > If you want to delete the delay attribute, you can do it like this: > pcs stonith update ipmilan_node1 delay= > This will remove delay and keep all the other attributes unchanged. > > I'm not sure why this principle is not documented in pcs man page. We > can fix that, though. > > Note that specifying a stonith agent in the update command does nothing > (which is expected) and is silently ignored by pcs (which is a bug). > > Regards, > Tomas Ah, thank you (and also Andrei)! Is this behaviour not documented in 'pcs stonith help' -> update? Or was I blind and missed it? -- Digimer
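Tomas's three rules, restated as commands against the resource from this thread:

```shell
# Attributes named in the command are set (or changed):
pcs stonith update ipmilan_node1 delay=15

# Attributes named with an empty value are removed:
pcs stonith update ipmilan_node1 delay=

# Attributes not named at all (ipaddr, username, password, ...) are untouched.
```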
[ClusterLabs] pcs stonith update problems
Hi all, I've got a predicament... I want to update a stonith resource to remove an argument. Specifically, when resources move nodes, I want to change the stonith delay to favour the new host. This involves adding the 'delay="x"' argument to one stonith resource, and removing it from the other; Example; # pcs cluster cib | grep -B7 -A7 '"delay"' Here, the stonith resource 'ipmilan_node1' has the delay="15". If I run: # pcs stonith update ipmilan_node1 fence_ipmilan ipaddr="10.201.17.1" password="xxx" username="admin"; echo $? 0 I see nothing happen in journald, and the delay argument remains in the 'pcs cluster cib' output. If, however, I do; # /usr/sbin/pcs stonith update ipmilan_node1 fence_ipmilan ipaddr="10.201.17.1" password="xxx" username="admin" delay="0"; echo $? 0 I can see in journald that the CIB was updated and can confirm in 'pcs cluster cib' that the 'delay' value becomes '0'. So it seems that, if an argument previously existed and is NOT specified in an update, it is not removed. Is this intentional for some reason? If so, how would I remove the delay attribute? I've got a fairly complex stonith config, with stonith levels. Deleting and recreating the config would be non-trivial. Pacemaker v2.1.0, pcs v0.10.8.181-47e9, CentOS Stream 8. digimer
Re: [ClusterLabs] Pacemaker 2.1.0 final release now available
On 2021-06-08 5:24 p.m., kgail...@redhat.com wrote: > Hi all, > > Pacemaker 2.1.0 has officially been released, with source code > available at: > > https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.1.0 > > Highlights include OCF Resource Agent API 1.1 compatibility, > noncritical resources, and new build-time options. The Pacemaker > documentation is now built using Sphinx instead of Publican, giving a > fresher look: > > https://clusterlabs.org/pacemaker/doc/ > > For more details, see the ChangeLog in the source repository, and the > following wiki page, which distribution packagers and users who build > Pacemaker from source or use Pacemaker command-line tools in scripts > are encouraged to go over carefully: > > https://wiki.clusterlabs.org/wiki/Pacemaker_2.1_Changes Huge congrats!! -- Digimer
Re: [ClusterLabs] Cluster Stopped, No Messages?
On 2021-05-28 3:08 p.m., Eric Robinson wrote: > >> -Original Message- >> From: Digimer >> Sent: Friday, May 28, 2021 12:43 PM >> To: Cluster Labs - All topics related to open-source clustering welcomed >> ; Eric Robinson ; Strahil >> Nikolov >> Subject: Re: [ClusterLabs] Cluster Stopped, No Messages? >> >> Shared storage is not what triggers the need for fencing. Coordinating >> actions >> is what triggers the need. Specifically; If you can run resource on both/all >> nodes at the same time, you don't need HA. If you can't, you need fencing. >> >> Digimer > > Thanks. That said, there is no fencing, so any thoughts on why the node > behaved the way it did? Without fencing, when a communication or membership issue arises, it's hard to predict what will happen. I don't see anything in the short log snippet to indicate what happened. What's on the other node during the event? When did the node disappear and when was it rejoined, to help find relevant log entries? Going forward, if you want predictable and reliable operation, implement fencing asap. Fencing is required. -- Digimer
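As a starting point for "implement fencing asap", a minimal IPMI fencing setup in pcs looks like the sketch below. The addresses and credentials are placeholders; the parameter names follow the fence_ipmilan examples elsewhere in this archive:

```shell
# One stonith device per node, each pointed at that node's BMC, so the
# surviving peer can power-fence it:
pcs stonith create ipmilan_node1 fence_ipmilan ipaddr=10.0.0.1 \
    username=admin password=secret pcmk_host_list=node1 \
    op monitor interval=60s
pcs stonith create ipmilan_node2 fence_ipmilan ipaddr=10.0.0.2 \
    username=admin password=secret pcmk_host_list=node2 \
    op monitor interval=60s

# Confirm each agent can actually reach its BMC before trusting it:
fence_ipmilan -a 10.0.0.1 -l admin -p secret -o status
```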
Re: [ClusterLabs] Cluster Stopped, No Messages?
Shared storage is not what triggers the need for fencing. Coordinating actions is what triggers the need. Specifically; If you can run resource on both/all nodes at the same time, you don't need HA. If you can't, you need fencing. digimer On 2021-05-28 1:19 p.m., Eric Robinson wrote: > There is no fencing agent on this cluster and no shared storage. > > -Eric > > *From:* Strahil Nikolov > *Sent:* Friday, May 28, 2021 10:08 AM > *To:* Cluster Labs - All topics related to open-source clustering > welcomed ; Eric Robinson > *Subject:* Re: [ClusterLabs] Cluster Stopped, No Messages? > > what is your fencing agent ? > > Best Regards, > > Strahil Nikolov > > On Thu, May 27, 2021 at 20:52, Eric Robinson > > mailto:eric.robin...@psmnv.com>> wrote: > > We found one of our cluster nodes down this morning. The server was > up but cluster services were not running. Upon examination of the > logs, we found that the cluster just stopped around 9:40:31 and then > I started it up manually (pcs cluster start) at 11:49:48. I can’t > imagine that Pacemaker just randomly terminates. Any thoughts why it > would behave this way? 
> > > > > > May 27 09:25:31 [92170] 001store01a pengine: notice: > process_pe_message: Calculated transition 91482, saving inputs in > /var/lib/pacemaker/pengine/pe-input-756.bz2 > > May 27 09:25:31 [92171] 001store01a crmd: info: > do_state_transition: State transition S_POLICY_ENGINE -> > S_TRANSITION_ENGINE | input=I_PE_SUCCESS cause=C_IPC_MESSAGE > origin=handle_response > > May 27 09:25:31 [92171] 001store01a crmd: info: > do_te_invoke: Processing graph 91482 > (ref=pe_calc-dc-1622121931-124396) derived from > /var/lib/pacemaker/pengine/pe-input-756.bz2 > > May 27 09:25:31 [92171] 001store01a crmd: notice: > run_graph: Transition 91482 (Complete=0, Pending=0, Fired=0, > Skipped=0, Incomplete=0, > Source=/var/lib/pacemaker/pengine/pe-input-756.bz2): Complete > > May 27 09:25:31 [92171] 001store01a crmd: info: > do_log: Input I_TE_SUCCESS received in state > S_TRANSITION_ENGINE from notify_crmd > > May 27 09:25:31 [92171] 001store01a crmd: notice: > do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE > | input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd > > May 27 09:40:31 [92171] 001store01a crmd: info: > crm_timer_popped: PEngine Recheck Timer (I_PE_CALC) just popped > (90ms) > > May 27 09:40:31 [92171] 001store01a crmd: notice: > do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE | > input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped > > May 27 09:40:31 [92171] 001store01a crmd: info: > do_state_transition: Progressed to state S_POLICY_ENGINE after > C_TIMER_POPPED > > May 27 09:40:31 [92170] 001store01a pengine: info: > process_pe_message: Input has not changed since last time, not > saving to disk > > May 27 09:40:31 [92170] 001store01a pengine: info: > determine_online_status: Node 001store01a is online > > May 27 09:40:31 [92170] 001store01a pengine: info: > determine_op_status: Operation monitor found resource > p_pure-ftpd-itls active on 001store01a > > May 27 09:40:31 [92170] 001store01a pengine: warning: 
> unpack_rsc_op_failure: Processing failed op monitor for > p_vip_ftpclust01 on 001store01a: unknown error (1) > > May 27 09:40:31 [92170] 001store01a pengine: info: > determine_op_status: Operation monitor found resource > p_pure-ftpd-etls active on 001store01a > > May 27 09:40:31 [92170] 001store01a pengine: info: > unpack_node_loop: Node 1 is already processed > > May 27 09:40:31 [92170] 001store01a pengine: info: > unpack_node_loop: Node 1 is already processed > > May 27 09:40:31 [92170] 001store01a pengine: info: > common_print: p_vip_ftpclust01 > (ocf::heartbeat:IPaddr2): Started 001store01a > > May 27 09:40:31 [92170] 001store01a pengine: info: > common_print: p_replicator (systemd:pure-replicator): > Started 001store01a > > May 27 09:40:31 [92170] 001store01a pengine: info: > common_print: p_pure-ftpd-etls > (systemd:pure-ftpd-etls): Started 001store01a > > May 27 09:40:31 [92170] 001store01a pengine: info: > common_print: p_pure-ftpd-itls >
Re: [ClusterLabs] #clusterlabs IRC channel
On 2021-05-19 1:10 p.m., Digimer wrote: > On 2021-05-19 12:58 p.m., Digimer wrote: >> On 2021-05-19 12:55 p.m., kgail...@redhat.com wrote: >>> Hello all, >>> >>> The ClusterLabs community has long used a #clusterlabs IRC channel on >>> the popular IRC server freenode.net. >>> >>> As you may have heard, freenode recently had a mass exodus of staff and >>> channels after a corporate buy-out that was perceived as threatening >>> the user community's values. >>> >>> Many have moved to a new server, libera.chat, organized as a nonprofit. >>> We have grabbed the #clusterlabs channel there to reserve the name. >>> (Thanks, digimer!) >>> >>> Our options are to move #clusterlabs to libera.chat, find another >>> public server, set up our own chat solution on clusterlabs.org, or >>> continue using freenode until the dust settles. >>> >>> By coincidence, I've been investigating Phacility as an open-source >>> project management tool for Pacemaker. I was already planning to set up >>> a demo instance on clusterlabs.org Friday. Phacility has a suite of >>> tools, including (web-based) chat, so I'll enable that, and we can >>> experiment with it to see if it's worth considering. >>> >>> Opinions welcome. I'll follow up on this thread with any new >>> developments. >> >> Speaking to another dev, he suggested OFTC, who have been around for 20 >> years, host other big open source projects, and has no connection to the >> freenode drama. For all these reasons, I'd like to change my suggestion >> to OFTC. As with libera.chat, I've setup and registered the channels the >> community use there as well. >> >> So as of right now, I've got things setup on both networks, and will go >> with whatever the majority chooses. For me personally, my votes are now >> +1 to oftc and -1 to libera.chat. 
> > I saw that #centos is moving to libera.chat, so I asked what helped them > make their decision; > > > On Wed, May 19, 2021 at 01:01:57PM -0400, Digimer wrote: >> >> Has the CentOS community considered OFTC? If so, I'm curious what made >> libera.chat win out. I have no horse in the race of either, and ask to >> see if I'm missing arguments for/against OFTC or libera. > > It's a different culture and oftc is unable to scale the way that the > sponsor model of libera allows. It's also _all_ the same staff from > freenode as of this morning. Same operational mode. Same commands. > Same culture. Same trust established with working with the staffers > over many years. > > It appears that Fedora is moving to libera as well: > > https://pagure.io/Fedora-Council/tickets/issue/371 > > > I expect some of the other Red Hat-based project channels > (#anaconda, for instance) will also move but this is merely a hunch. > > > Argument(s) for Libera Looks like most channels are going to Libera, so I'm modifying my vote back to +1 for libera and -1 to OFTC. -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] #clusterlabs IRC channel
On 2021-05-19 12:58 p.m., Digimer wrote: > On 2021-05-19 12:55 p.m., kgail...@redhat.com wrote: >> Hello all, >> >> The ClusterLabs community has long used a #clusterlabs IRC channel on >> the popular IRC server freenode.net. >> >> As you may have heard, freenode recently had a mass exodus of staff and >> channels after a corporate buy-out that was perceived as threatening >> the user community's values. >> >> Many have moved to a new server, libera.chat, organized as a nonprofit. >> We have grabbed the #clusterlabs channel there to reserve the name. >> (Thanks, digimer!) >> >> Our options are to move #clusterlabs to libera.chat, find another >> public server, set up our own chat solution on clusterlabs.org, or >> continue using freenode until the dust settles. >> >> By coincidence, I've been investigating Phacility as an open-source >> project management tool for Pacemaker. I was already planning to set up >> a demo instance on clusterlabs.org Friday. Phacility has a suite of >> tools, including (web-based) chat, so I'll enable that, and we can >> experiment with it to see if it's worth considering. >> >> Opinions welcome. I'll follow up on this thread with any new >> developments. > > Speaking to another dev, he suggested OFTC, who have been around for 20 > years, host other big open source projects, and has no connection to the > freenode drama. For all these reasons, I'd like to change my suggestion > to OFTC. As with libera.chat, I've setup and registered the channels the > community use there as well. > > So as of right now, I've got things setup on both networks, and will go > with whatever the majority chooses. For me personally, my votes are now > +1 to oftc and -1 to libera.chat. I saw that #centos is moving to libera.chat, so I asked what helped them make their decision; On Wed, May 19, 2021 at 01:01:57PM -0400, Digimer wrote: > > Has the CentOS community considered OFTC? If so, I'm curious what made > libera.chat win out. 
I have no horse in the race of either, and ask to > see if I'm missing arguments for/against OFTC or libera. It's a different culture and oftc is unable to scale the way that the sponsor model of libera allows. It's also _all_ the same staff from freenode as of this morning. Same operational mode. Same commands. Same culture. Same trust established with working with the staffers over many years. It appears that Fedora is moving to libera as well: https://pagure.io/Fedora-Council/tickets/issue/371 I expect some of the other Red Hat-based project channels (#anaconda, for instance) will also move but this is merely a hunch. Argument(s) for Libera -- Digimer
Re: [ClusterLabs] #clusterlabs IRC channel
On 2021-05-19 12:55 p.m., kgail...@redhat.com wrote: > Hello all, > > The ClusterLabs community has long used a #clusterlabs IRC channel on > the popular IRC server freenode.net. > > As you may have heard, freenode recently had a mass exodus of staff and > channels after a corporate buy-out that was perceived as threatening > the user community's values. > > Many have moved to a new server, libera.chat, organized as a nonprofit. > We have grabbed the #clusterlabs channel there to reserve the name. > (Thanks, digimer!) > > Our options are to move #clusterlabs to libera.chat, find another > public server, set up our own chat solution on clusterlabs.org, or > continue using freenode until the dust settles. > > By coincidence, I've been investigating Phacility as an open-source > project management tool for Pacemaker. I was already planning to set up > a demo instance on clusterlabs.org Friday. Phacility has a suite of > tools, including (web-based) chat, so I'll enable that, and we can > experiment with it to see if it's worth considering. > > Opinions welcome. I'll follow up on this thread with any new > developments. Speaking to another dev, he suggested OFTC, which has been around for 20 years, hosts other big open source projects, and has no connection to the freenode drama. For all these reasons, I'd like to change my suggestion to OFTC. As with libera.chat, I've set up and registered the channels the community uses there as well. So as of right now, I've got things set up on both networks, and will go with whatever the majority chooses. For me personally, my votes are now +1 to oftc and -1 to libera.chat. -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops."
- Stephen Jay Gould
Re: [ClusterLabs] 32 nodes pacemaker cluster setup issue
On 2021-05-18 1:13 p.m., S Sathish S wrote: > Hi Digimer/Team, > > > > In our product use unicast protocols and CPU load is normal while > problematic timing. > > > > We don’t defined corosync / totem timing values using default timing > till now, Please suggest what basic we need to tune this parameter based > on cluster nodes ? > > > > [root@node1 ~]# corosync-cmapctl | grep totem.token > > runtime.config.totem.token (u32) = 19850 > > runtime.config.totem.token_retransmit (u32) = 4726 > > runtime.config.totem.token_retransmits_before_loss_const (u32) = 4 I have no experience with such large clusters, so I can't offer direct advice. Perhaps this thread is helpful? https://lists.clusterlabs.org/pipermail/users/2016-August/010999.html You might also find useful advice in 'man 5 corosync.conf'. -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
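To make the 'man 5 corosync.conf' pointer concrete: the runtime timings quoted above look like stock defaults (19850 ms is consistent with the default 1000 ms token plus the default 650 ms token_coefficient scaled by the node count), so there is room to tune. A sketch of the relevant /etc/corosync/corosync.conf knobs; the values below are illustrative, not recommendations:

```
totem {
    # Base token timeout in ms. The effective runtime value is
    # token + (nodes - 2) * token_coefficient, so raising either
    # lets a large membership tolerate slower token rotation.
    token: 3000

    # Extra ms added to the token timeout per node above two
    # (default 650).
    token_coefficient: 650

    # Retransmit attempts before the token is declared lost.
    token_retransmits_before_loss_const: 10
}
```

After editing, the resulting runtime values can be checked with 'corosync-cmapctl | grep totem', as in the output quoted above.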
Re: [ClusterLabs] 32 nodes pacemaker cluster setup issue
[TOTEM ] A new membership > (10.217.41.26:104104) was formed. Members joined: 27 29 18 > > May 18 16:22:20 [1968] node2 corosync notice [QUORUM] Members[4]: 27 29 > 32 18 > > May 18 16:22:20 [1968] node2 corosync notice [MAIN ] Completed service > synchronization, ready to provide service. > > May 18 16:22:45 [1968] node2 corosync notice [TOTEM ] A new membership > (10.217.41.26:104112) was formed. Members > > May 18 16:22:45 [1968] node2 corosync notice [QUORUM] Members[4]: 27 29 > 32 18 > > May 18 16:22:45 [1968] node2 corosync notice [MAIN ] Completed service > synchronization, ready to provide service. > > May 18 16:22:46 [1968] node2 corosync notice [TOTEM ] A new membership > (10.217.41.26:104116) was formed. Members joined: 30 > > May 18 16:22:46 [1968] node2 corosync notice [QUORUM] Members[5]: 27 29 > 30 32 18 > > > > *Any PCS command will fail with error message on all nodes:* > > [root@node1 online]# pcs property set maintenance-mode=false --wait=240 > Error: Unable to update cib > Call cib_replace failed (-62): Timer expired > > [root@node1 online]# > > > > *Workaround *: we poweroff all nodes and bring nodes one-by-one to > overcome above problem statement , kindly check on this error message > and provide us RCA for this problem. > > > > > > *Current Pacemaker version* : > > pacemaker-2.0.2 --> > https://github.com/ClusterLabs/pacemaker/tree/Pacemaker-2.0.2 > <https://github.com/ClusterLabs/pacemaker/tree/Pacemaker-2.0.2> > > corosync-2.4.4 --> https://github.com/corosync/corosync/tree/v2.4.4 > <https://github.com/corosync/corosync/tree/v2.4.4> > > pcs-0.9.169 > > > > Thanks and Regards, > > S Sathish S As I understand it, clusters over 16 nodes are generally discouraged. When you do build large clusters, the time needed to sync the CIB (and handle other messaging) can become too lengthy. Have you played with corosync / totem timing values? May need to increase them. Are you using unicast or multicast? 
What is the CPU load like on the nodes when these issues arise? -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
Re: [ClusterLabs] Problem with the cluster becoming mostly unresponsive
On 2021-05-14 6:06 p.m., kgail...@redhat.com wrote: > On Fri, 2021-05-14 at 15:04 -0400, Digimer wrote: >> Hi all, >> >> I'm run into an issue a couple of times now, and I'm not really >> sure >> what's causing it. I've got a RHEL 8 cluster that, after a while, >> will >> show one or more resources as 'FAILED'. When I try to do a cleanup, >> it >> marks the resources as stopped, despite them still running. After >> that, >> all attempts to manage the resources cause no change. The pcs command >> seems to have no effect, and in some cases refuses to return. >> >> The logs from the nodes (filtered for 'pcs' and 'pacem' since boot) >> are >> here (resources running on node 2): >> >> - >> https://www.alteeve.com/files/an-a02n01.pacemaker_hang.2021-05-14.txt > > The SNMP fence agent fails to start: > > May 12 23:29:25 an-a02n01.alteeve.com pacemaker-fenced[5947]: warning: > fence_apc_snmp[12842] stderr: [ ] > May 12 23:29:25 an-a02n01.alteeve.com pacemaker-fenced[5947]: warning: > fence_apc_snmp[12842] stderr: [ ] > May 12 23:29:25 an-a02n01.alteeve.com pacemaker-fenced[5947]: warning: > fence_apc_snmp[12842] stderr: [ 2021-05-12 23:29:25,955 ERROR: Please use > '-h' for usage ] > May 12 23:29:25 an-a02n01.alteeve.com pacemaker-fenced[5947]: warning: > fence_apc_snmp[12842] stderr: [ ] > May 12 23:29:25 an-a02n01.alteeve.com pacemaker-fenced[5947]: notice: > Operation 'monitor' [12842] for device 'apc_snmp_node2_an-pdu02' returned: > -201 (Generic Pacemaker error) > May 12 23:29:25 an-a02n01.alteeve.com pacemaker-controld[5951]: notice: > Result of start operation for apc_snmp_node2_an-pdu02 on an-a02n01: error I noticed this, but I have no idea why it would have failed... The 'fence_apc_snmp' is the bog-standard fence agent... 
> which is fatal (because start-failure-is-fatal=true): > > May 12 23:29:26 an-a02n01.alteeve.com pacemaker-attrd[5949]: notice: Setting > fail-count-apc_snmp_node2_an-pdu01#start_0[an-a02n02]: (unset) -> INFINITY > May 12 23:29:26 an-a02n01.alteeve.com pacemaker-attrd[5949]: notice: Setting > last-failure-apc_snmp_node2_an-pdu01#start_0[an-a02n02]: (unset) -> 1620876566 > > That happens for both devices on both nodes, so they get stopped > (successfully), which effectively disables them from being used, though > I don't see them needed in these logs so it wouldn't matter. So a monitor failure on the fence agent rendered the cluster effectively unresponsive? How would I normally recover from this? > It looks like you did a cleanup here: > > May 14 14:19:30 an-a02n01.alteeve.com pacemaker-controld[5951]: notice: > Forcing the status of all resources to be redetected > > It's hard to tell what happened after that without the detail log > (/var/log/pacemaker/pacemaker.log). The resource history should have > been wiped from the CIB, and probes of everything should have been > scheduled and executed. But I don't see any scheduler output, which is > odd. Next time I start the cluster, I will truncate the pacemaker log. Then if/when it fails again (seems to be happening regularly) I'll provide the pacemaker.log file. > Then we get a shutdown request, but the node has already left without > getting the OK to do so: > > May 14 14:22:58 an-a02n01.alteeve.com pacemaker-attrd[5949]: notice: Setting > shutdown[an-a02n02]: (unset) -> 1621016578 > May 14 14:42:58 an-a02n01.alteeve.com pacemaker-controld[5951]: warning: > Stonith/shutdown of node an-a02n02 was not expected > May 14 14:42:58 an-a02n01.alteeve.com pacemaker-attrd[5949]: notice: Node > an-a02n02 state is now lost > > The log ends there so I'm not sure what happens after that. I'd expect > this node to want to fence the other one. 
Since the fence devices are > failed, that can't happen, so that could be why the node is unable to > shut down itself. > >> - >> https://www.alteeve.com/files/an-a02n02.pacemaker_hang.2021-05-14.txt >> >> For example, it took 20 minutes for the 'pcs cluster stop' to >> complete. (Note that I tried restarting the pcsd daemon while >> waiting) >> >> BTW, I see the errors about fence_delay metadata, that will be >> fixed >> and I don't believe it's related. >> >> Any advice on what happened, how to avoid it, and how to clean up >> without a full cluster restart, should it happen again? -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
[ClusterLabs] Problem with the cluster becoming mostly unresponsive
Hi all,

I've run into an issue a couple of times now, and I'm not really sure what's causing it. I've got a RHEL 8 cluster that, after a while, will show one or more resources as 'FAILED'. When I try to do a cleanup, it marks the resources as stopped, despite them still running. After that, all attempts to manage the resources cause no change. The pcs command seems to have no effect, and in some cases refuses to return.

The logs from the nodes (filtered for 'pcs' and 'pacem' since boot) are here (resources running on node 2):

- https://www.alteeve.com/files/an-a02n01.pacemaker_hang.2021-05-14.txt
- https://www.alteeve.com/files/an-a02n02.pacemaker_hang.2021-05-14.txt

For example, it took 20 minutes for the 'pcs cluster stop' to complete. (Note that I tried restarting the pcsd daemon while waiting.)

BTW, I see the errors about fence_delay metadata; that will be fixed and I don't believe it's related.

Any advice on what happened, how to avoid it, and how to clean up without a full cluster restart, should it happen again?

-- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
Re: [ClusterLabs] 2 node mariadb-cluster - constraint-problems ?
gfiles-mysql-server-mandatory)
> Colocation Constraints:
>   fs_logfiles with drbd_logsfiles-clone (score:INFINITY) (with-rsc-role:Master) (id:colocation-fs_logfiles-drbd_logsfiles-clone-INFINITY)
>   fs_database with database_drbd-clone (score:INFINITY) (with-rsc-role:Master) (id:colocation-fs_database-database_drbd-clone-INFINITY)
>   drbd_logsfiles-clone with database_drbd-clone (score:INFINITY) (rsc-role:Master) (with-rsc-role:Master) (id:colocation-drbd_logsfiles-clone-database_drbd-clone-INFINITY)
>   HA_IP with database_drbd-clone (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master) (id:colocation-HA_IP-database_drbd-clone-INFINITY)
>   mysql-server with fs_database (score:INFINITY) (id:colocation-mysql-server-fs_database-INFINITY)
>   httpd_srv with mysql-server (score:INFINITY) (id:colocation-httpd_srv-mysql-server-INFINITY)
> Ticket Constraints:
>
> Alerts:
>  No alerts defined
>
> Resources Defaults:
>  No defaults set
> Operations Defaults:
>  No defaults set
>
> Cluster Properties:
>  cluster-infrastructure: corosync
>  cluster-name: mysql_cluster
>  dc-version: 2.0.4-6.el8_3.1-2deceaa3ae
>  have-watchdog: false
>  last-lrm-refresh: 1620742514
>  stonith-enabled: FALSE
>
> Tags:
>  No tags defined
>
> Quorum:
>  Options:
>
> Any suggestions are welcome
>
> best regards stay safe, take care
>
> fatcharly

-- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
Re: [ClusterLabs] fencing
On 2021-05-07 6:36 a.m., Kyle O'Donnell wrote: > Hi Everyone. > > We've setup fencing with our ilo/idrac interfaces and things generally > work well but during some of our failover scenario testing we ran into > issues when we "failed' the switches in which those ilo/idrac interfaces > were connected. The issue was that resources were migrated away from > any node with an offline fencing device. I can see how that is > desirable, but in our case this is essentially a single point of > failure. How are others managing this? > > In one of our sites we have "smart" APC power strips so we can setup > multiple fencing devices, but in another site we do not. I tried > increasing the timeout= value on the fencing devices but that did not > seem to work. > > Thanks, > Kyle We use a pair of switched PDUs connected to a second switch (also, we do active/passive bonds for all links, each link in a bond going to different switches). This allows for either switch to be lost without interruption of network traffic and leaving one fence method available. 
Here's how we configure it to use IPMI (iDRAC, iRMC, iLO, etc) first, and to use a pair of PDUs as backup;

pcs stonith create ipmilan_node1 fence_ipmilan pcmk_host_list="an-a02n01" ipaddr="10.201.13.1" password="another secret p" username="admin" delay="15" op monitor interval="60"
pcs stonith level add 1 an-a02n01 ipmilan_node1
pcs stonith create ipmilan_node2 fence_ipmilan pcmk_host_list="an-a02n02" ipaddr="10.201.13.2" password="another secret p" username="admin" op monitor interval="60"
pcs stonith level add 1 an-a02n02 ipmilan_node2
pcs stonith create apc_snmp_node1_psu1 fence_apc_snmp pcmk_host_list="an-a02n01" pcmk_off_action="reboot" ip="10.201.2.3" port="3" power_wait="5" op monitor interval="60"
pcs stonith create apc_snmp_node1_psu2 fence_apc_snmp pcmk_host_list="an-a02n01" pcmk_off_action="reboot" ip="10.201.2.4" port="3" power_wait="5" op monitor interval="60"
pcs stonith level add 2 an-a02n01 apc_snmp_node1_psu1,apc_snmp_node1_psu2
pcs stonith create apc_snmp_node2_psu1 fence_apc_snmp pcmk_host_list="an-a02n02" pcmk_off_action="reboot" ip="10.201.2.3" port="4" power_wait="5" op monitor interval="60"
pcs stonith create apc_snmp_node2_psu2 fence_apc_snmp pcmk_host_list="an-a02n02" pcmk_off_action="reboot" ip="10.201.2.4" port="4" power_wait="5" op monitor interval="60"
pcs stonith level add 2 an-a02n02 apc_snmp_node2_psu1,apc_snmp_node2_psu2
pcs property set stonith-max-attempts=INFINITY
pcs property set stonith-enabled=true

In the above example, node 1 is plugged into outlet 3 on both PDUs, and node 2 is on outlet 4, with PDU 1 at IP 10.201.2.3 and PDU 2 at IP 10.201.2.4.

-- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops."
- Stephen Jay Gould
Re: [ClusterLabs] Stopping the last node with pcs
On 2021-04-28 10:10 a.m., Ken Gaillot wrote: > On Tue, 2021-04-27 at 23:23 -0400, Digimer wrote: >> Hi all, >> >> I noticed something odd. >> >> >> [root@an-a02n01 ~]# pcs cluster status >> Cluster Status: >> Cluster Summary: >>* Stack: corosync >>* Current DC: an-a02n01 (version 2.0.4-6.el8_3.2-2deceaa3ae) - >> partition with quorum >>* Last updated: Tue Apr 27 23:20:45 2021 >>* Last change: Tue Apr 27 23:12:40 2021 by root via cibadmin on >> an-a02n01 >>* 2 nodes configured >>* 12 resource instances configured (4 DISABLED) >> Node List: >>* Online: [ an-a02n01 ] >>* OFFLINE: [ an-a02n02 ] >> >> PCSD Status: >> an-a02n01: Online >> an-a02n02: Offline >> >> [root@an-a02n01 ~]# pcs cluster stop >> Error: Stopping the node will cause a loss of the quorum, use --force >> to >> override >> >> >> Shouldn't pcs know it's the last node and shut down without >> complaint? > > It knows, it's just not sure you know :) > > pcs's design philosophy is to hand-hold users by default and give > expert users --force. > > The idea in this case is that (especially in 3-to-5-node clusters) > someone might not realize that stopping one node could make all > resources stop cluster-wide. This makes total sense in a 3+ node cluster. However, when you're asking the last node in a two-node cluster to stop, then it seems odd. Perhaps overriding this behaviour when 2-node is set? In any case, I'm calling this from a program and that means I need to use '--force' all the time (or add some complex logic of my own, which I can do). Well anyway, now I know it was intentional. :) -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
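For anyone else driving this from a script: one way to avoid hard-coding --force everywhere is to add it only when stopping the last online node. A rough sketch; counting the online nodes (e.g. by parsing 'pcs status') is left to the caller, and stop_flags is a made-up helper name:

```shell
#!/bin/sh
# Emit the extra flag 'pcs cluster stop' needs when this is the last
# online node. pcs only objects when stopping would lose quorum, which
# in a two-node cluster means the final node.
stop_flags() {
    online_count="$1"
    if [ "$online_count" -le 1 ]; then
        echo "--force"
    fi
}

# Usage (illustrative): pcs cluster stop $(stop_flags "$online_count")
```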
[ClusterLabs] Stopping the last node with pcs
Hi all,

I noticed something odd.

[root@an-a02n01 ~]# pcs cluster status
Cluster Status:
 Cluster Summary:
   * Stack: corosync
   * Current DC: an-a02n01 (version 2.0.4-6.el8_3.2-2deceaa3ae) - partition with quorum
   * Last updated: Tue Apr 27 23:20:45 2021
   * Last change: Tue Apr 27 23:12:40 2021 by root via cibadmin on an-a02n01
   * 2 nodes configured
   * 12 resource instances configured (4 DISABLED)
 Node List:
   * Online: [ an-a02n01 ]
   * OFFLINE: [ an-a02n02 ]

PCSD Status:
  an-a02n01: Online
  an-a02n02: Offline

[root@an-a02n01 ~]# pcs cluster stop
Error: Stopping the node will cause a loss of the quorum, use --force to override

Shouldn't pcs know it's the last node and shut down without complaint?

-- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
Re: [ClusterLabs] Cyberpower PDU41001 fencing agent?
On 2021-04-25 1:28 a.m., Jeremy Hansen wrote: > Just curious if there’s any work being done on Cyberpower PDUs. I’d like to > use this PDU with Cobbler, but it doesn’t look like there’s a fencing agent > available. > > Thanks > -jeremy I don't own one, but I've written a couple PDU-based fence agents. They seem to support SNMP and per-outlet switching, so I suspect supporting it would be fairly easy. You'll need to know some OIDs, and the community that allows write access, to set the outlet states. Can you use 'snmpwalk' to collect the data from the PDU? -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
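To expand on the snmpwalk suggestion: once the right OIDs are known, switching an outlet is a single snmpset. Everything below is illustrative only; the OID shown is APC's sPDUOutletCtl, standing in for whatever the Cyberpower MIB actually exposes (walk the PDU's enterprise tree, e.g. 'snmpwalk -v2c -c public <pdu-ip> .1.3.6.1.4.1', to find the real ones), and outlet_cmd is a made-up helper:

```shell
#!/bin/sh
# Build (but don't run) the snmpset command that switches one outlet.
# The OID is APC's sPDUOutletCtl, used here only as a placeholder;
# state 1=on, 2=off follows the APC convention.
outlet_cmd() {
    pdu_ip="$1"; outlet="$2"; state="$3"
    echo "snmpset -v2c -c private $pdu_ip .1.3.6.1.4.1.318.1.1.4.4.2.1.3.$outlet i $state"
}

# Example: outlet_cmd 10.201.2.3 3 2   # the command to turn outlet 3 off
```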
[ClusterLabs] Single-node automated startup question
Hi all,

As we get close to finishing our Anvil! switch to pacemaker, I'm trying to tie up loose ends. One that I want feedback on is the pacemaker version of cman's old 'post_join_delay' feature.

Use case example; A common use for the Anvil! is remote deployments where there are no (IT) humans available. Think cargo ships, field data collection, etc. So it's entirely possible that a node could fail and not be repaired for weeks or even months. With this in mind, it's also feasible that a solo node later loses power, and then reboots. In such a case, 'pcs cluster start' would never go quorate as the peer is dead.

In cman, during startup, if there was no reply from the peer after post_join_delay seconds, the peer would get fenced and then the cluster would finish coming up. Being two_node, it would also become quorate and start hosting services. Of course, this opens the risk of a fence loop, but we have other protections in place to prevent that, so a fence loop is not a concern.

My question then is two-fold;

1. Is there a pacemaker equivalent to 'post_join_delay'? (Fence the peer and, if successful, become quorate)?
2. If not, was this a conscious decision not to add it for some reason, or was it simply never added? If it was consciously decided not to have it, what was the reasoning behind it?

I can replicate this behaviour in our code, but I don't want to do that if there is a compelling reason that I am not aware of. So, A) is there a pacemaker version of post_join_delay? B) is there a compelling argument NOT to use post_join_delay behaviour in pacemaker I am not seeing?

Thanks!

-- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops."
- Stephen Jay Gould
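Not a direct answer to question 1, but part of this behaviour can be assembled from existing knobs. A sketch of the corosync side (corosync 2.x votequorum assumed; whether disabling wait_for_all is acceptable depends entirely on the fence-loop protections described above):

```
quorum {
    provider: corosync_votequorum
    two_node: 1
    # two_node normally implies wait_for_all: 1, which is exactly what
    # keeps a cold-booted solo node from going quorate. Explicitly
    # disabling it lets the lone node become quorate on its own, after
    # which pacemaker's default startup fencing shoots the absent peer
    # before services are recovered, much like post_join_delay did.
    wait_for_all: 0
}
```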
Re: [ClusterLabs] Antw: Re: Antw: RE: Antw: [EXT] Re: "Error: unable to fence '001db02a'" but It got fenced anyway
On 2021-03-05 12:26 p.m., Klaus Wenninger wrote: > On 3/5/21 6:04 PM, Digimer wrote: >> On 2021-03-05 2:14 a.m., Ulrich Windl wrote: >>>>> How would the fencing be confirmed? I don't know. >>>> It's part of the FenceAgentAPI. The cluster invokes the fence agent, >>>> passes in variable=value pairs on STDIN, and waits for the agent to >>>> exit. It reads the agent's exit code and uses that to determine success >>>> or failure. >>> But the agent "acting remote" cannot be sure the "remote end" was killed, >>> specifically when the network connection seems dead. >>> I see that in the IPMI case you have a separate connection allowing >>> "out-of-band signaling", but in the general case that would not be possible. >> To elaborate on Klaus's reply; >> >> The cluster has no control over how the fence agent works, it can only >> dictate the API and expect the fence agent is implemented in a sane way. >> If your agent returns success, but the node wasn't confirmed off >> properly in the agent, you will get a split-brain and that will be no >> fault of the cluster itself. >> >> Speaking to the "remote end" part; >> >> All good fence agents need to work regardless of the state of the target >> node. If, somehow, a fence agent needs the target to be in some sort of >> defines state, it is a critically flawed fence agent. A classic example >> of this is the often-requested "ssh fence agent" (and it's why such an >> agent doesn't exist). >> >> So your fence agent must be able to work out of band, by definition and >> design. When you call an IPMI BMC, you are effectively talking to a >> different mini computer on the target. Even then, if the mainboard >> utterly dies and takes the BMC with it, it will fail to fence as well. >> This is why at Alteeve we always have a backup fence method, switched >> PDUs on different switches from the IPMI BMC connections. >> >> Fencing really is critical, and as such, it should be certain to work, >> and ideally, have a backup fence method. 
So if you find that your >> fence-azure agent isn't reliable, and you can use SBD as Klaus >> mentioned, you can configure fence-sbd as a backup method to fence-azure. >> > Nothing to add - to the point as usually - but that the statement from > Ulrich looked general - not necessarily azure specific - and thus my > comment was as well. > Just wanted to state that I didn't advertise SBD as fencing method > for azure. SBD needs a reliable watchdog and afaik softdog is the only > watchdog you have on azure (maybe different for certain BareMetal > offerings). > If you consider that reliable enough you have to negotiate with your own > conscience or the provider of your distribution ;-) > > Klaus I know nothing of Azure. If you don't have a hardware watchdog, fence-sbd is not reliable. -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
Re: [ClusterLabs] [ClusterLabs Developers] fence-virt: consider to merge into fence-agents git repository
On 2021-03-05 3:34 a.m., Oyvind Albrigtsen wrote: > Hi, > > We are considering to merge the fence-virt repo into the fence-agents > git repository. > > Tell us if you have any objections to merging the git repositories, so > we can take any objections into consideration. +1 to merge -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
Re: [ClusterLabs] Antw: Re: Antw: RE: Antw: [EXT] Re: "Error: unable to fence '001db02a'" but It got fenced anyway
On 2021-03-05 2:14 a.m., Ulrich Windl wrote: >>> How would the fencing be confirmed? I don't know. >> >> It's part of the FenceAgentAPI. The cluster invokes the fence agent, >> passes in variable=value pairs on STDIN, and waits for the agent to >> exit. It reads the agent's exit code and uses that to determine success >> or failure. > > But the agent "acting remote" cannot be sure the "remote end" was killed, > specifically when the network connection seems dead. > I see that in the IPMI case you have a separate connection allowing > "out-of-band signaling", but in the general case that would not be possible. To elaborate on Klaus's reply; The cluster has no control over how the fence agent works, it can only dictate the API and expect the fence agent is implemented in a sane way. If your agent returns success, but the node wasn't confirmed off properly in the agent, you will get a split-brain and that will be no fault of the cluster itself. Speaking to the "remote end" part; All good fence agents need to work regardless of the state of the target node. If, somehow, a fence agent needs the target to be in some sort of defined state, it is a critically flawed fence agent. A classic example of this is the often-requested "ssh fence agent" (and it's why such an agent doesn't exist). So your fence agent must be able to work out of band, by definition and design. When you call an IPMI BMC, you are effectively talking to a different mini computer on the target. Even then, if the mainboard utterly dies and takes the BMC with it, it will fail to fence as well. This is why at Alteeve we always have a backup fence method, switched PDUs on different switches from the IPMI BMC connections. Fencing really is critical, and as such, it should be certain to work, and ideally, have a backup fence method. 
-- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
Re: [ClusterLabs] Antw: RE: Antw: [EXT] Re: "Error: unable to fence '001db02a'" but It got fenced anyway
On 2021-03-03 6:53 p.m., Eric Robinson wrote: > >> -Original Message- >> From: Users On Behalf Of Ulrich Windl >> Sent: Wednesday, March 3, 2021 12:57 AM >> To: users@clusterlabs.org >> Subject: [ClusterLabs] Antw: RE: Antw: [EXT] Re: "Error: unable to fence >> '001db02a'" but It got fenced anyway >> >>>>> Eric Robinson schrieb am 02.03.2021 um >>>>> 19:26 in >> Nachricht >> > 3.prod.outlook.com> >> >>>> -Original Message- >>>> From: Users On Behalf Of Digimer >>>> Sent: Monday, March 1, 2021 11:02 AM >>>> To: Cluster Labs - All topics related to open-source clustering >>>> welcomed ; Ulrich Windl >>>> >>>> Subject: Re: [ClusterLabs] Antw: [EXT] Re: "Error: unable to fence >>> '001db02a'" >> ... >>>>>> Cloud fencing usually requires a higher timeout (20s reported here). >>>>>> >>>>>> Microsoft seems to suggest the following setup: >>>>>> >>>>>> # pcs property set stonith‑timeout=900 >>>>> >>>>> But doesn't that mean the other node waits 15 minutes after stonith >>>>> until it performs the first post-stonith action? >>>> >>>> No, it means that if there is no reply by then, the fence has failed. >>>> If >> the >>>> fence happens sooner, and the caller is told this, recovery begins >>>> very >>> shortly >>>> after. >> >> How would the fencing be confirmed? I don't know. >> >> >>>> >>> >>> Interesting. Since users often report application failure within 1-3 >>> minutes >> >>> and may engineers begin investigating immediately, a technician could >>> end up >> >>> connecting to a cluster node after the stonith command was called, and >>> could >> >>> conceivably bring a failed node back up manually, only to have Azure >>> finally get around to shooting it in the head. I don't suppose there's >>> a way to abort/cancel a STONITH operation that is in progress? >> >> I think you have to decide: Let the cluster handle the problem, or let the >> admin handle the problem, but preferrably not both. >> I also think you cannot cancel a STONITH; you can only confirm it. 
>> >> Regards, >> Ulrich >> > > Standing by and letting the cluster handle the problem is a hard pill to > swallow when a technician could resolve things and bring services back up > sooner, but I get your point. In all my years, I've learned to trust carefully reviewed code to do the right thing over humans. Outside HA specialists, most people set up HA and forget it, often for months or even years. The idea that they would remember what to do, accurately, while there is also a major outage is, to me, a much harder pill to swallow. A well-tested HA cluster that is designed properly will have a far, far higher chance of quickly and efficiently recovering services during an outage. This is extra true if the problem arises after beer o'clock on a Friday evening on the first day of vacation. Be careful not to confuse the effort needed to do the initial, proper build and testing with the reliability of the system. The more thorough you are during the build, the more reliable your system will be over time. -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Antw: RE: Antw: [EXT] Re: "Error: unable to fence '001db02a'" but It got fenced anyway
On 2021-03-03 1:56 a.m., Ulrich Windl wrote: >>>> Eric Robinson schrieb am 02.03.2021 um 19:26 in > Nachricht > > >>> -Original Message- >>> From: Users On Behalf Of Digimer >>> Sent: Monday, March 1, 2021 11:02 AM >>> To: Cluster Labs - All topics related to open-source clustering welcomed >>> ; Ulrich Windl >>> Subject: Re: [ClusterLabs] Antw: [EXT] Re: "Error: unable to fence >> '001db02a'" > ... >>>>> Cloud fencing usually requires a higher timeout (20s reported here). >>>>> >>>>> Microsoft seems to suggest the following setup: >>>>> >>>>> # pcs property set stonith‑timeout=900 >>>> >>>> But doesn't that mean the other node waits 15 minutes after stonith >>>> until it performs the first post-stonith action? >>> >>> No, it means that if there is no reply by then, the fence has failed. If > the >>> fence happens sooner, and the caller is told this, recovery begins very >> shortly >>> after. > > How would the fencing be confirmed? I don't know. It's part of the FenceAgentAPI. The cluster invokes the fence agent, passes in variable=value pairs on STDIN, and waits for the agent to exit. It reads the agent's exit code and uses that to determine success or failure. So if the fence agent is invoked and, 5 seconds later, it exits with the "success" RC, the cluster knows the peer is gone and that it can now safely begin recovery. -- Digimer
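The exit-code contract described above is easy to demonstrate outside a cluster. A minimal sketch with a throwaway mock agent standing in for a real fence agent (the path, keys, and logic here are illustrative, not any shipped agent):

```shell
# Mock fence agent illustrating the FenceAgentAPI contract: the caller
# writes variable=value pairs on STDIN, and success/failure is judged
# purely by the exit code. Illustrative only, not a real agent.
cat > /tmp/mock_fence_agent <<'EOF'
#!/bin/sh
action=""
# Read variable=value pairs from STDIN until EOF.
while IFS='=' read -r key value; do
    [ "$key" = "action" ] && action="$value"
done
# Pretend the reboot succeeded; anything else is a failure.
if [ "$action" = "reboot" ]; then
    exit 0
fi
exit 1
EOF
chmod +x /tmp/mock_fence_agent

# Standing in for pacemaker-fenced: pipe the request in, read the RC.
printf 'plug=node2\naction=reboot\n' | /tmp/mock_fence_agent
echo "fence agent exit code: $?"
# prints: fence agent exit code: 0
```

As in the post: the moment the agent exits 0, the caller knows the peer is gone and recovery can begin.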
Re: [ClusterLabs] Antw: [EXT] Re: "Error: unable to fence '001db02a'" but It got fenced anyway
On 2021-03-01 2:50 a.m., Ulrich Windl wrote: >>>> Valentin Vidic schrieb am 28.02.2021 um > 16:59 > in Nachricht <20210228155921.gm29...@valentin-vidic.from.hr>: >> On Sun, Feb 28, 2021 at 03:34:20PM +, Eric Robinson wrote: >>> 001db02b rebooted. After it came back up, I tried it in the other > direction. >>> >>> On node 001db02b, the command... >>> >>> # pcs stonith fence 001db02a >>> >>> ...produced output... >>> >>> Error: unable to fence '001db02a'. >>> >>> However, node 001db02a did get restarted! >>> >>> We also saw this error... >>> >>> Failed Actions: >>> * stonith‑001db02ab_start_0 on 001db02a 'unknown error' (1): call=70, >> status=Timed Out, exitreason='', >>> last‑rc‑change='Sun Feb 28 10:11:10 2021', queued=0ms, exec=20014ms >>> >>> When that happens, does Pacemaker take over the other node's resources, or > >> not? >> >> Cloud fencing usually requires a higher timeout (20s reported here). >> >> Microsoft seems to suggest the following setup: >> >> # pcs property set stonith‑timeout=900 > > But doesn't that mean the other node waits 15 minutes after stonith until it > performs the first post-stonith action? No, it means that if there is no reply by then, the fence has failed. If the fence happens sooner, and the caller is told this, recovery begins very shortly after. 
>> # pcs stonith create rsc_st_azure fence_azure_arm username="login ID" >> password="password" resourceGroup="resource group" tenantId="tenant ID" >> subscriptionId="subscription id" >> > pcmk_host_map="prod‑cl1‑0:prod‑cl1‑0‑vm‑name;prod‑cl1‑1:prod‑cl1‑1‑vm‑name" >> power_timeout=240 pcmk_reboot_timeout=900 pcmk_monitor_timeout=120 >> pcmk_monitor_retries=4 pcmk_action_limit=3 >> op monitor interval=3600 >> >> > https://docs.microsoft.com/en‑us/azure/virtual‑machines/workloads/sap/high‑avai > >> lability‑guide‑rhel‑pacemaker >> >> ‑‑ >> Valentin >> ___ >> Manage your subscription: >> https://lists.clusterlabs.org/mailman/listinfo/users >> >> ClusterLabs home: https://www.clusterlabs.org/ > > > > ___ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ > -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?
On 2021-02-26 12:23 p.m., Eric Robinson wrote: >> -Original Message- >> From: Digimer >> Sent: Friday, February 26, 2021 10:35 AM >> To: Cluster Labs - All topics related to open-source clustering welcomed >> ; Eric Robinson >> Subject: Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went >> Down Anyway? >> >> On 2021-02-26 11:19 a.m., Eric Robinson wrote: >>> At 5:16 am Pacific time Monday, one of our cluster nodes failed and >>> its mysql services went down. The cluster did not automatically recover. >>> >>> We're trying to figure out: >>> >>> 1. Why did it fail? >>> 2. Why did it not automatically recover? >>> >>> The cluster did not recover until we manually executed... >>> >>> # pcs resource cleanup p_mysql_622 >>> >>> OS: CentOS Linux release 7.5.1804 (Core) >>> >>> Cluster version: >>> >>> corosync.x86_64 2.4.5-4.el7 @base >>> corosync-qdevice.x86_64 2.4.5-4.el7 @base >>> pacemaker.x86_64 1.1.21-4.el7@base >>> >>> Two nodes: 001db01a, 001db01b >>> >>> The following log snippet is from node 001db01a: >>> >>> [root@001db01a cluster]# grep "Feb 22 05:1[67]" corosync.log-20210223 >> >> >> >>> Feb 22 05:16:30 [91682] 001db01apengine: warning: cluster_status: >> Fencing and resource management disabled due to lack of quorum >> >> Seems like there was no quorum from this node's perspective, so it won't do >> anything. What does the other node's logs say? >> > > The logs from the other node are at the bottom of the original email. > >> What is the cluster configuration? Do you have stonith (fencing) configured? > > 2-node with a separate qdevice. No fencing. > >> Quorum is a useful tool when things are working properly, but it doesn't help >> when things enter an undefined / unexpected state. >> When that happens, stonith saves you. So said another way, you must have >> stonith for a stable cluster, quorum is optional. >> > > In this case, if fencing was enabled, which node would have fenced the other? > Would they have gotten into a STONITH war? 
You can set a preference for which node wins by assigning a fence delay to your preferred node. So say your services were running on node 1, you put the delay on the fence method that shoots node 1. So in a case like this, node 2 looks up how to fence node 1, sees the delay, and waits. Node 1 looks up how to fence node 2, sees no delay, and fences immediately. If, however, node 1 was actually dead, then after the delay (typically 15 seconds), node 2 proceeds with the fence and takes over the lost services. Without fencing / stonith, what happens during a failure is undetermined. All production clusters really must have fencing. If you also have quorum, then the delay doesn't matter and the node that maintains contact with the quorum node wins. However, if something breaks all cluster communications (corosync, specifically), both nodes lose quorum and neither recovers. For this reason, I never bother with quorum (set the two-node flag), and just rely on fencing. That takes avoidable complexity out of the system. > More importantly, why did the failure of resource p_mysql_622 keep the whole > cluster from recovering? As soon as I did 'pcs resource cleanup p_mysql_622' > all the other resources recovered, but none of them are dependent on that > resource. > >> -- >> Digimer >> Papers and Projects: https://alteeve.com/w/ "I am, somehow, less >> interested in the weight and convolutions of Einstein's brain than in the >> near >> certainty that people of equal talent have lived and died in cotton fields >> and >> sweatshops." - Stephen Jay Gould > Disclaimer : This email and any files transmitted with it are confidential > and intended solely for intended recipients. If you are not the named > addressee you should not disseminate, distribute, copy or alter this email. > Any views or opinions presented in this email are solely those of the author > and might not represent those of Physician Select Management. 
Warning: > Although Physician Select Management has taken reasonable precautions to > ensure no viruses are present in this email, the company cannot accept > responsibility for any loss or damage arising from the use of this email or > attachments. > -- Digimer
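The fence-delay scheme described above might be configured with pcs roughly like this. A sketch only: the hostnames, addresses, and credentials are placeholders, and exact fence_ipmilan parameter names can vary between fence-agents versions:

```shell
# Node 1 is the preferred survivor: its fence method carries the delay,
# so node 2 hesitates 15s before shooting it in a split. Node 1's view
# of node 2 has no delay, so node 1 fences immediately and wins.
pcs stonith create fence_n1 fence_ipmilan ip="10.20.1.1" \
    username="admin" password="secret" pcmk_host_list="node1" delay="15"
pcs stonith create fence_n2 fence_ipmilan ip="10.20.1.2" \
    username="admin" password="secret" pcmk_host_list="node2"
```

If node 1 really is dead, node 2 simply waits out the delay and then proceeds with the fence, as the post describes.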
Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?
On 2021-02-26 11:19 a.m., Eric Robinson wrote: > At 5:16 am Pacific time Monday, one of our cluster nodes failed and its > mysql services went down. The cluster did not automatically recover. > > We’re trying to figure out: > > 1. Why did it fail? > 2. Why did it not automatically recover? > > The cluster did not recover until we manually executed… > > # pcs resource cleanup p_mysql_622 > > OS: CentOS Linux release 7.5.1804 (Core) > > Cluster version: > > corosync.x86_64 2.4.5-4.el7 @base > corosync-qdevice.x86_64 2.4.5-4.el7 @base > pacemaker.x86_64 1.1.21-4.el7 @base > > Two nodes: 001db01a, 001db01b > > The following log snippet is from node 001db01a: > > [root@001db01a cluster]# grep "Feb 22 05:1[67]" corosync.log-20210223 > Feb 22 05:16:30 [91682] 001db01apengine: warning: cluster_status: > Fencing and resource management disabled due to lack of quorum Seems like there was no quorum from this node's perspective, so it won't do anything. What does the other node's logs say? What is the cluster configuration? Do you have stonith (fencing) configured? Quorum is a useful tool when things are working properly, but it doesn't help when things enter an undefined / unexpected state. When that happens, stonith saves you. So said another way, you must have stonith for a stable cluster, quorum is optional. -- Digimer
Re: [ClusterLabs] Antw: [EXT] Re: Stop timeout=INFINITY not working
On 2021-01-27 2:29 a.m., Ulrich Windl wrote: >>>> Ken Gaillot schrieb am 26.01.2021 um 16:08 in > Nachricht > : >> On Tue, 2021‑01‑26 at 02:12 ‑0500, Digimer wrote: >>> Hi all, >>> >>> I created a resource with an INFINITE stop timeout; >>> >>> pcs resource create srv01‑test ocf:alteeve:server name="srv01‑test" >>> meta >>> allow‑migrate="true" target‑role="stopped" op monitor interval="60" >>> start timeout="INFINITY" on‑fail="block" stop timeout="INFINITY" >>> on‑fail="block" migrate_to timeout="INFINITY" >> >> I hadn't noticed this before, but it looks like INFINITY is not allowed >> in time interval specifications, and there's no log warning about it. >> :‑/ > > Hi! > > I was wondering why someone would set a timeout to something like a day or > more: > To give the operator a chance to investigate and fix problems before the > cluster tries recovery? > > Regards, > Ulrich Windows. Microsoft decided a while back that the perfect time to install OS updates was when a windows server or workstation was told to shut down. Back in the rgmanager days, this was a problem because rgmanager terminated a resource that didn't stop in two minutes. So while a client's windows server VM was saying "Do not power off your computer!", the cluster pulled the plug. So we set an INFINITE timeout (well, that needs to change now) so that if this happened, the cluster would keep waiting. It's dumb, but it is what it is. -- Digimer
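Since INFINITY is not accepted in time-interval specifications (and produces no warning), one workaround is a very large finite timeout. A sketch only: the resource name comes from the thread, while the 24-hour value is an arbitrary stand-in, not a recommendation:

```shell
# Approximate "wait forever" with a day-long stop timeout instead of
# the rejected INFINITY value.
pcs resource update srv01-test op stop timeout=86400s on-fail=block
```

The trade-off is the same one Ulrich raises: a huge timeout just gives the operator time to intervene before the cluster attempts recovery.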
Re: [ClusterLabs] Stopping all nodes causes servers to migrate
On 2021-01-26 11:27 a.m., Ken Gaillot wrote: > On Tue, 2021-01-26 at 11:03 -0500, Digimer wrote: >> On 2021-01-26 10:15 a.m., Tomas Jelinek wrote: >>> Dne 25. 01. 21 v 17:01 Ken Gaillot napsal(a): >>>> On Mon, 2021-01-25 at 09:51 +0100, Jehan-Guillaume de Rorthais >>>> wrote: >>>>> Hi Digimer, >>>>> >>>>> On Sun, 24 Jan 2021 15:31:22 -0500 >>>>> Digimer wrote: >>>>> [...] >>>>>> I had a test server (srv01-test) running on node 1 (el8- >>>>>> a01n01), >>>>>> and on >>>>>> node 2 (el8-a01n02) I ran 'pcs cluster stop --all'. >>>>>> >>>>>>It appears like pacemaker asked the VM to migrate to node >>>>>> 2 >>>>>> instead of >>>>>> stopping it. Once the server was on node 2, I couldn't use >>>>>> 'pcs >>>>>> resource >>>>>> disable ' as it returned that that resource was >>>>>> unmanaged, and >>>>>> the >>>>>> cluster shut down was hung. When I directly stopped the VM >>>>>> and then >>>>>> did >>>>>> a 'pcs resource cleanup', the cluster shutdown completed. >>>>> >>>>> As actions during a cluster shutdown cannot be handled in the >>>>> same >>>>> transition >>>>> for each nodes, I usually add a step to disable all resources >>>>> using >>>>> property >>>>> "stop-all-resources" before shutting down the cluster: >>>>> >>>>>pcs property set stop-all-resources=true >>>>>pcs cluster stop --all >>>>> >>>>> But it seems there's a very new cluster property to handle that >>>>> (IIRC, one or >>>>> two releases ago). Look at "shutdown-lock" doc: >>>>> >>>>>[...] >>>>>some users prefer to make resources highly available only >>>>> for >>>>> failures, with >>>>>no recovery for clean shutdowns. If this option is true, >>>>> resources >>>>> active on a >>>>>node when it is cleanly shut down are kept "locked" to that >>>>> node >>>>> (not allowed >>>>>to run elsewhere) until they start again on that node after >>>>> it >>>>> rejoins (or >>>>>for at most shutdown-lock-limit, if set). >>>>>[...] >>>>> >>>>> [...] 
>>>>>>So as best as I can tell, pacemaker really did ask for a >>>>>> migration. Is >>>>>> this the case? >>>>> >>>>> AFAIK, yes, because each cluster shutdown request is handled >>>>> independently at >>>>> node level. There's a large door open for all kind of race >>>>> conditions >>>>> if >>>>> requests are handled with some random lags on each nodes. >>>> >>>> I'm going to guess that's what happened. >>>> >>>> The basic issue is that there is no "cluster shutdown" in >>>> Pacemaker, >>>> only "node shutdown". I'm guessing "pcs cluster stop --all" sends >>>> shutdown requests for each node in sequence (probably via >>>> systemd), and >>>> if the nodes are quick enough, one could start migrating off >>>> resources >>>> before all the others get their shutdown request. >>> >>> Pcs is doing its best to stop nodes in parallel. The first >>> implementation of this was done back in 2015: >>> https://bugzilla.redhat.com/show_bug.cgi?id=1180506 >>> Since then, we moved to using curl for network communication, which >>> also >>> handles parallel cluster stop. Obviously, this doesn't ensure the >>> stop >>> command arrives to and is processed on all nodes at the exactly >>> same time. >>> >>> Basically, pcs sends 'stop pacemaker' request to all nodes in >>> parallel >>> and waits for it to finish on all nodes. Then it sends 'stop >>> corosync' >>> request to all nodes in parallel. The actual stopping on each node >>> is >>> done by 'systemctl stop'. >>> >>> Yes, the nodes which get the request sooner may start migrating >>> resources. >>> >>> Regards, >>> Tomas >> >> Given the case I had, where a resource went unmanaged and the stop >> hung >> indefinitely, would that be considered a bug? > > That depends on why. You'll have to check the logs around that time to > see if there are any details. It would be considered appropriate if > e.g. an action with on-fail=block failed. OK, I'll try to reproduce and, if I can, post the logs. 
-- Digimer
Re: [ClusterLabs] Stopping all nodes causes servers to migrate
On 2021-01-26 10:15 a.m., Tomas Jelinek wrote: > Dne 25. 01. 21 v 17:01 Ken Gaillot napsal(a): >> On Mon, 2021-01-25 at 09:51 +0100, Jehan-Guillaume de Rorthais wrote: >>> Hi Digimer, >>> >>> On Sun, 24 Jan 2021 15:31:22 -0500 >>> Digimer wrote: >>> [...] >>>> I had a test server (srv01-test) running on node 1 (el8-a01n01), >>>> and on >>>> node 2 (el8-a01n02) I ran 'pcs cluster stop --all'. >>>> >>>> It appears like pacemaker asked the VM to migrate to node 2 >>>> instead of >>>> stopping it. Once the server was on node 2, I couldn't use 'pcs >>>> resource >>>> disable ' as it returned that that resource was unmanaged, and >>>> the >>>> cluster shut down was hung. When I directly stopped the VM and then >>>> did >>>> a 'pcs resource cleanup', the cluster shutdown completed. >>> >>> As actions during a cluster shutdown cannot be handled in the same >>> transition >>> for each nodes, I usually add a step to disable all resources using >>> property >>> "stop-all-resources" before shutting down the cluster: >>> >>> pcs property set stop-all-resources=true >>> pcs cluster stop --all >>> >>> But it seems there's a very new cluster property to handle that >>> (IIRC, one or >>> two releases ago). Look at "shutdown-lock" doc: >>> >>> [...] >>> some users prefer to make resources highly available only for >>> failures, with >>> no recovery for clean shutdowns. If this option is true, resources >>> active on a >>> node when it is cleanly shut down are kept "locked" to that node >>> (not allowed >>> to run elsewhere) until they start again on that node after it >>> rejoins (or >>> for at most shutdown-lock-limit, if set). >>> [...] >>> >>> [...] >>>> So as best as I can tell, pacemaker really did ask for a >>>> migration. Is >>>> this the case? >>> >>> AFAIK, yes, because each cluster shutdown request is handled >>> independently at >>> node level. There's a large door open for all kind of race conditions >>> if >>> requests are handled with some random lags on each nodes. 
>> >> I'm going to guess that's what happened. >> >> The basic issue is that there is no "cluster shutdown" in Pacemaker, >> only "node shutdown". I'm guessing "pcs cluster stop --all" sends >> shutdown requests for each node in sequence (probably via systemd), and >> if the nodes are quick enough, one could start migrating off resources >> before all the others get their shutdown request. > > Pcs is doing its best to stop nodes in parallel. The first > implementation of this was done back in 2015: > https://bugzilla.redhat.com/show_bug.cgi?id=1180506 > Since then, we moved to using curl for network communication, which also > handles parallel cluster stop. Obviously, this doesn't ensure the stop > command arrives to and is processed on all nodes at the exactly same time. > > Basically, pcs sends 'stop pacemaker' request to all nodes in parallel > and waits for it to finish on all nodes. Then it sends 'stop corosync' > request to all nodes in parallel. The actual stopping on each node is > done by 'systemctl stop'. > > Yes, the nodes which get the request sooner may start migrating resources. > > Regards, > Tomas Given the case I had, where a resource went unmanaged and the stop hung indefinitely, would that be considered a bug? -- Digimer
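Jehan-Guillaume's workaround from earlier in the thread, written out as a small shutdown sketch (the commands are the ones quoted above; the reminder to clear the property afterwards is the one caveat he notes):

```shell
# Disable all resources first, so no node starts live-migrating anything
# while the staggered per-node shutdown requests arrive.
pcs property set stop-all-resources=true
pcs cluster stop --all

# After the next cluster start-up, the property must be cleared again,
# or nothing will run:
#   pcs property set stop-all-resources=false
```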
[ClusterLabs] Stop timeout=INFINITY not working
Hi all, I created a resource with an INFINITE stop timeout; pcs resource create srv01-test ocf:alteeve:server name="srv01-test" meta allow-migrate="true" target-role="stopped" op monitor interval="60" start timeout="INFINITY" on-fail="block" stop timeout="INFINITY" on-fail="block" migrate_to timeout="INFINITY" Then I tried stopping it (on a highly loaded system) and it timed out after just 20 seconds and got flagged as failed; Jan 26 07:06:19 el8-a01n01.alteeve.ca pacemaker-controld[1846038]: notice: High CPU load detected: 3.57 Jan 26 07:06:49 el8-a01n01.alteeve.ca pacemaker-controld[1846038]: notice: High CPU load detected: 3.48 Jan 26 07:07:05 el8-a01n01.alteeve.ca pacemaker-controld[1846038]: notice: State transition S_IDLE -> S_POLICY_ENGINE Jan 26 07:07:05 el8-a01n01.alteeve.ca pacemaker-schedulerd[1846037]: notice: * Stop srv01-test ( el8-a01n01 ) due to node availability Jan 26 07:07:05 el8-a01n01.alteeve.ca pacemaker-schedulerd[1846037]: notice: Calculated transition 179, saving inputs in /var/lib/pacemaker/pengine/pe-input-76.bz2 Jan 26 07:07:05 el8-a01n01.alteeve.ca pacemaker-controld[1846038]: notice: Initiating stop operation srv01-test_stop_0 locally on el8-a01n01 Jan 26 07:07:19 el8-a01n01.alteeve.ca pacemaker-controld[1846038]: notice: High CPU load detected: 3.85 Jan 26 07:07:25 el8-a01n01.alteeve.ca kernel: drbd srv01-test: role( Primary -> Secondary ) Jan 26 07:07:25 el8-a01n01.alteeve.ca pacemaker-execd[1846035]: warning: srv01-test_stop_0 process (PID 2647133) timed out Jan 26 07:07:25 el8-a01n01.alteeve.ca pacemaker-execd[1846035]: warning: srv01-test_stop_0[2647133] timed out after 2ms Jan 26 07:07:25 el8-a01n01.alteeve.ca pacemaker-controld[1846038]: error: Result of stop operation for srv01-test on el8-a01n01: Timed Out Jan 26 07:07:25 el8-a01n01.alteeve.ca pacemaker-controld[1846038]: notice: el8-a01n01-srv01-test_stop_0:89 [ The server: [srv01-test] is indeed running. 
It will be shut down now.\n ] Did I not configure the stop timeout correctly? Thanks for any insight. -- Digimer
Re: [ClusterLabs] Stopping all nodes causes servers to migrate
On 2021-01-25 3:58 p.m., Ken Gaillot wrote: > On Mon, 2021-01-25 at 13:18 -0500, Digimer wrote: >> On 2021-01-25 11:01 a.m., Ken Gaillot wrote: >>> On Mon, 2021-01-25 at 09:51 +0100, Jehan-Guillaume de Rorthais >>> wrote: >>>> Hi Digimer, >>>> >>>> On Sun, 24 Jan 2021 15:31:22 -0500 >>>> Digimer wrote: >>>> [...] >>>>> I had a test server (srv01-test) running on node 1 (el8- >>>>> a01n01), >>>>> and on >>>>> node 2 (el8-a01n02) I ran 'pcs cluster stop --all'. >>>>> >>>>> It appears like pacemaker asked the VM to migrate to node 2 >>>>> instead of >>>>> stopping it. Once the server was on node 2, I couldn't use 'pcs >>>>> resource >>>>> disable ' as it returned that that resource was unmanaged, >>>>> and >>>>> the >>>>> cluster shut down was hung. When I directly stopped the VM and >>>>> then >>>>> did >>>>> a 'pcs resource cleanup', the cluster shutdown completed. >>>> >>>> As actions during a cluster shutdown cannot be handled in the >>>> same >>>> transition >>>> for each nodes, I usually add a step to disable all resources >>>> using >>>> property >>>> "stop-all-resources" before shutting down the cluster: >>>> >>>> pcs property set stop-all-resources=true >>>> pcs cluster stop --all >>>> >>>> But it seems there's a very new cluster property to handle that >>>> (IIRC, one or >>>> two releases ago). Look at "shutdown-lock" doc: >>>> >>>> [...] >>>> some users prefer to make resources highly available only for >>>> failures, with >>>> no recovery for clean shutdowns. If this option is true, >>>> resources >>>> active on a >>>> node when it is cleanly shut down are kept "locked" to that >>>> node >>>> (not allowed >>>> to run elsewhere) until they start again on that node after it >>>> rejoins (or >>>> for at most shutdown-lock-limit, if set). >>>> [...] >>>> >>>> [...] >>>>> So as best as I can tell, pacemaker really did ask for a >>>>> migration. Is >>>>> this the case? 
>>>> >>>> AFAIK, yes, because each cluster shutdown request is handled >>>> independently at >>>> node level. There's a large door open for all kind of race >>>> conditions >>>> if >>>> requests are handled with some random lags on each nodes. >>> >>> I'm going to guess that's what happened. >>> >>> The basic issue is that there is no "cluster shutdown" in >>> Pacemaker, >>> only "node shutdown". I'm guessing "pcs cluster stop --all" sends >>> shutdown requests for each node in sequence (probably via systemd), >>> and >>> if the nodes are quick enough, one could start migrating off >>> resources >>> before all the others get their shutdown request. >>> >>> There would be a way around it. Normally Pacemaker is shut down via >>> SIGTERM to pacemakerd (which is what systemctl stop does), but >>> inside >>> Pacemaker it's implemented as a special "shutdown" transient node >>> attribute, set to the epoch timestamp of the request. It would be >>> possible to set that attribute for all nodes in a copy of the CIB, >>> then >>> load that into the live cluster. >>> >>> stop-all-resources as suggested would be another way around it (and >>> would have to be cleared after start-up, which could be a plus or a >>> minus depending on how much control vs convenience you want). >> >> Thanks for your and everyone else's replies! >> >> I'm left curious about one part of this though; When the node >> migrated, >> the resource was then listed as unmanaged. So the resource was never >> requested to shutdown and the cluster shutdown on that node then >> hung. >> >> I can understand what's happening that triggered the migration, and I >> can understand how to prevent it in the future. (Truth be told, the >> Anvil! already would shut down all servers before
Re: [ClusterLabs] Stopping all nodes causes servers to migrate
On 2021-01-25 11:01 a.m., Ken Gaillot wrote: > On Mon, 2021-01-25 at 09:51 +0100, Jehan-Guillaume de Rorthais wrote: >> Hi Digimer, >> >> On Sun, 24 Jan 2021 15:31:22 -0500 >> Digimer wrote: >> [...] >>> I had a test server (srv01-test) running on node 1 (el8-a01n01), >>> and on >>> node 2 (el8-a01n02) I ran 'pcs cluster stop --all'. >>> >>> It appears like pacemaker asked the VM to migrate to node 2 >>> instead of >>> stopping it. Once the server was on node 2, I couldn't use 'pcs >>> resource >>> disable ' as it returned that that resource was unmanaged, and >>> the >>> cluster shut down was hung. When I directly stopped the VM and then >>> did >>> a 'pcs resource cleanup', the cluster shutdown completed. >> >> As actions during a cluster shutdown cannot be handled in the same >> transition >> for each nodes, I usually add a step to disable all resources using >> property >> "stop-all-resources" before shutting down the cluster: >> >> pcs property set stop-all-resources=true >> pcs cluster stop --all >> >> But it seems there's a very new cluster property to handle that >> (IIRC, one or >> two releases ago). Look at "shutdown-lock" doc: >> >> [...] >> some users prefer to make resources highly available only for >> failures, with >> no recovery for clean shutdowns. If this option is true, resources >> active on a >> node when it is cleanly shut down are kept "locked" to that node >> (not allowed >> to run elsewhere) until they start again on that node after it >> rejoins (or >> for at most shutdown-lock-limit, if set). >> [...] >> >> [...] >>> So as best as I can tell, pacemaker really did ask for a >>> migration. Is >>> this the case? >> >> AFAIK, yes, because each cluster shutdown request is handled >> independently at >> node level. There's a large door open for all kind of race conditions >> if >> requests are handled with some random lags on each nodes. > > I'm going to guess that's what happened. 
> > The basic issue is that there is no "cluster shutdown" in Pacemaker, > only "node shutdown". I'm guessing "pcs cluster stop --all" sends > shutdown requests for each node in sequence (probably via systemd), and > if the nodes are quick enough, one could start migrating off resources > before all the others get their shutdown request. > > There would be a way around it. Normally Pacemaker is shut down via > SIGTERM to pacemakerd (which is what systemctl stop does), but inside > Pacemaker it's implemented as a special "shutdown" transient node > attribute, set to the epoch timestamp of the request. It would be > possible to set that attribute for all nodes in a copy of the CIB, then > load that into the live cluster. > > stop-all-resources as suggested would be another way around it (and > would have to be cleared after start-up, which could be a plus or a > minus depending on how much control vs convenience you want). Thanks for your and everyone else's replies! I'm left curious about one part of this though; When the node migrated, the resource was then listed as unmanaged. So the resource was never requested to shutdown and the cluster shutdown on that node then hung. I can understand what's happening that triggered the migration, and I can understand how to prevent it in the future. (Truth be told, the Anvil! already would shut down all servers before calling the pacemaker stop, but I wanted to test possible fault conditions). Is it not a bug that the cluster was unable to stop after the migration? If I understand what's been said in this thread, the host node got a shutdown request so it migrated the resource. Then the peer (new host) would have gotten the shutdown request, should it then have seen the peer was gone and shut the resource down? Why did it enter an unmanaged state? 
Cheers -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
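The stop-all-resources workaround discussed in this thread can be wrapped in a short script. This is a sketch only: the two pcs commands are the ones quoted above, while the run()/cluster_shutdown() wrappers and the DRY_RUN flag are illustrative additions so the plan can be reviewed before running it for real.

```shell
# Sketch of the whole-cluster shutdown workaround from this thread. The
# run() wrapper and DRY_RUN flag are illustrative; the two pcs commands
# are the ones suggested by Jehan-Guillaume above.
run() { echo "+ $*"; [ -n "$DRY_RUN" ] || "$@"; }

cluster_shutdown() {
    # One CIB change stops every resource in a single transition, so no
    # node starts migrating resources before the others are told to stop.
    run pcs property set stop-all-resources=true
    # The per-node shutdown requests can now arrive in any order safely.
    run pcs cluster stop --all
    # Note: the property persists, so after the next cluster start run:
    #   pcs property set stop-all-resources=false
}
```

With DRY_RUN set, the function only prints the intended commands, which makes the sequence easy to audit.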
[ClusterLabs] Stopping all nodes causes servers to migrate
Hi all, Right off the bat; I'm using a custom RA so this behaviour might be a bug in my agent. I had a test server (srv01-test) running on node 1 (el8-a01n01), and on node 2 (el8-a01n02) I ran 'pcs cluster stop --all'. It appears that pacemaker asked the VM to migrate to node 2 instead of stopping it. Once the server was on node 2, I couldn't use 'pcs resource disable ' as it returned that that resource was unmanaged, and the cluster shutdown hung. When I directly stopped the VM and then did a 'pcs resource cleanup', the cluster shutdown completed. In my agent, I noted these environment variables had been set; OCF_RESKEY_name=srv01-test OCF_RESKEY_CRM_meta_migrate_source=el8-a01n01 OCF_RESKEY_CRM_meta_migrate_target=el8-a01n02 OCF_RESKEY_CRM_meta_on_node=el8-a01n01 So as best as I can tell, pacemaker really did ask for a migration. Is this the case? If not, what environment variables should have been set in this scenario? Thanks for any insight! -- Digimer
Re: [ClusterLabs] Stopping a server failed and fenced, despite disabling stop timeout
On 2021-01-19 4:57 a.m., Tomas Jelinek wrote: > Dne 18. 01. 21 v 20:08 Digimer napsal(a): >> On 2021-01-18 4:49 a.m., Tomas Jelinek wrote: >>> Hi Digimer, >>> >>> Regarding pcs behavior: >>> >>> When deleting a resource, pcs first sets its target-role to Stopped, >>> pushes the change into pacemaker and waits for the resource to stop. >>> Once the resource stops, pcs removes the resource from CIB. If pcs >>> simply removed the resource from CIB without stopping it first, the >>> resource would be running as orphaned (until pacemaker stops it if >>> configured to do so). We want to avoid that. >>> >>> If the resource cannot be stopped for whatever reason, pcs reports this >>> and advises running the delete command with --force. Running 'pcs >>> resource delete --force' skips the part where pcs sets target role and >>> waits for the resource to stop, making pcs simply remove the resource >>> from CIB. >>> >>> I agree that pcs should handle deleting unmanaged resources in a better >>> way. We plan to address that, but it's not on top of the priority list. >>> Our plan is actually to prevent deleting unmanaged resources (or require >>> --force to be specified to do so) based on the following scenario: >>> >>> If a resource is deleted while in unmanaged state, it ends up in >>> ORPHANED state - it is removed from CIB but still present in running >>> configuration. This can cause various issues, i.e. when unmanaged >>> resource is stopped manually outside of the cluster there might be >>> problems with stopping the resource upon deletion (while unmanaged) >>> which may end up with stonith being initiated - this is not desired. >>> >>> >>> Regards, >>> Tomas >> >> This logic makes sense. If I may propose a reason for an alternative >> method; >> >> In my case, the idea I was experimenting with was to remove a running >> server from cluster management, without actually shutting down the >> server. 
This is somewhat contrived, I freely admin, but the idea of >> taking a server out of the config entirely without shutting it down >> could be useful in some cases. >> >> In my case, I didn't worry about the orphaned state and the risk of it >> trying to start elsewhere as there are additional safeguards in place to >> prevent this (both in our software and in that DRBD is not set to >> dual-primary, so the VM simply can't start elsewhere while it's running >> somewhere). >> >> Totally understand it's not a priority, but when this is addressed, some >> special mechanism to say "I know this will leave it orphaned and that's >> OK" would be nice to have. > > You can do it even now with "pcs resource delete --force". I admit it's > not the best way and an extra flag (--dont-stop or similar) would be > better. I wrote the idea into our notes so it doesn't get forgotten. > > Tomas Very much appreciated! Please let me know if/when that happens. :) -- Digimer
Re: [ClusterLabs] Antw: [EXT] Re: Stopping a server failed and fenced, despite disabling stop timeout
On 2021-01-19 2:27 a.m., Ulrich Windl wrote: >>>> Digimer schrieb am 18.01.2021 um 20:08 in Nachricht > <64c1aa75-a15a-95c3-6853-e21fc0dc8...@alteeve.ca>: >> On 2021-01-18 4:49 a.m., Tomas Jelinek wrote: >>> Hi Digimer, >>> >>> Regarding pcs behavior: >>> >>> When deleting a resource, pcs first sets its target-role to Stopped, >>> pushes the change into pacemaker and waits for the resource to stop. >>> Once the resource stops, pcs removes the resource from CIB. If pcs >>> simply removed the resource from CIB without stopping it first, the >>> resource would be running as orphaned (until pacemaker stops it if >>> configured to do so). We want to avoid that. >>> >>> If the resource cannot be stopped for whatever reason, pcs reports this >>> and advises running the delete command with --force. Running 'pcs >>> resource delete --force' skips the part where pcs sets target role and >>> waits for the resource to stop, making pcs simply remove the resource >>> from CIB. >>> >>> I agree that pcs should handle deleting unmanaged resources in a better >>> way. We plan to address that, but it's not on top of the priority list. >>> Our plan is actually to prevent deleting unmanaged resources (or require >>> --force to be specified to do so) based on the following scenario: >>> >>> If a resource is deleted while in unmanaged state, it ends up in >>> ORPHANED state - it is removed from CIB but still present in running >>> configuration. This can cause various issues, i.e. when unmanaged >>> resource is stopped manually outside of the cluster there might be >>> problems with stopping the resource upon deletion (while unmanaged) >>> which may end up with stonith being initiated - this is not desired. >>> >>> >>> Regards, >>> Tomas >> >> This logic makes sense. If I may propose a reason for an alternative > method; >> >> In my case, the idea I was experimenting with was to remove a running >> server from cluster management, without actually shutting down the >> server. 
This is somewhat contrived, I freely admin, but the idea of >> taking a server out of the config entirely without shutting it down >> could be useful in some cases. > > Assuming that the server runs resources, I'd consider that to be highly > dangerous for data consistency. > If you want to remove the node from the cluster, why not shut down the cluster > node first? That would stop, move or migrate any resources running there. > Then you would not remove any resources that still run on the other node(s). > Basically I wonder what types of resources you would remove anyway. > Finally I would remove the node configuration from the cluster. > > Regards, > Ulrich To be clear, this won't be a "supported" condition, but I have seen it done and needed to do it before for odd reasons. Separately; In our case, we have external mechanisms that prevent a resource from running in two places. First, our software, which acts as a logic layer over top of pacemaker, tracks a server's position directly (it doesn't rely on pacemaker's idea of location; it queries libvirtd directly). Secondly, in our new system, we run DRBD on a per-server basis and keep allow-two-primaries off, save for during a live migration. We've tested this and it properly refuses to start a VM on another node while it's running somewhere. So while your concern is absolutely valid, in our specific use case, we have additional safety systems that allow a VM to operate outside pacemaker without risking a split-brain. cheers -- Digimer
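The per-migration dual-primary window described above might look roughly like this. The resource/domain names, the run() wrapper, and the exact virsh invocation are assumptions, not the Anvil!'s actual code; drbdadm net-options --allow-two-primaries is standard DRBD 9 syntax.

```shell
# Rough sketch of the "allow-two-primaries only during a live migration"
# pattern described above. Names, the run() wrapper and the exact virsh
# flags are illustrative assumptions.
run() { echo "+ $*"; [ -n "$DRY_RUN" ] || "$@"; }

migrate_server() {
    server=$1 target=$2
    # Open the dual-primary window for this one DRBD resource only.
    run drbdadm net-options --allow-two-primaries=yes "$server"
    run virsh migrate --live "$server" "qemu+ssh://$target/system"
    # Close the window again so the VM cannot be started twice.
    run drbdadm net-options --allow-two-primaries=no "$server"
}
```

Keeping the window open only for the duration of one migration is what makes it safe to leave dual-primary off the rest of the time.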
Re: [ClusterLabs] Stopping a server failed and fenced, despite disabling stop timeout
On 2021-01-18 1:52 p.m., Ken Gaillot wrote: > On Sun, 2021-01-17 at 21:11 -0500, Digimer wrote: >> Hi all, >> >> Mind the slew of questions, well into testing now and finding lots >> of >> issues. This one is two questions... :) >> >> I set a server to be unamaged in pacemaker while the server was >> running. Then I tried to remove the resource, and it refused saying >> it >> couldn't stop it, and to use '--force'. So I did, and the node got >> fenced. Now, the resource was setup with; >> >> pcs resource create srv07-el6 ocf:alteeve:server name="srv07-el6" \ >> meta allow-migrate="true" target-role="started" \ >> op monitor interval="60" start timeout="INFINITY" \ >> on-fail="block" stop timeout="INFINITY" on-fail="block" \ >> migrate_to timeout="INFINITY" >> >> I would have expected the 'stop timeout="INFINITY" on-fail="block"' >> to >> prevent fencing if the server failed to stop (question 1) and that if >> a >> resource was unmanaged, that the resource wouldn't even try to stop >> (question 2). > > It would have ... if you hadn't removed it :) Ahahaha! OK, I see now why it happened. I should have realized this. :) > Removing a resource means removing its configuration entry. Pacemaker > memorizes the resource agent parameters used to start a resource, so it > can still execute a stop when the configuration is removed. However > currently it does not memorize any resource meta-attributes or > operations that were configured, so those are lost when the resource is > removed. > > We maybe should remember more information about the resource than just > its agent parameters, but it should probably be selective rather than a > broad net. > > FYI, the stop-orphan-resources property controls whether a resource > whose configuration is removed should be stopped (defaulting to true). Oh! That's excellent. 
So if I know I want to remove a resource from pacemaker, but keep it running, I can update the resource's config to set that, then delete the resource and the service will still keep running? Do you know off hand what the pcs command would be to set that? No worries if not, I can sort it out easily enough I am sure. digimer >> Can someone help me understand what happened here? >> >> digimer >> >> More below; >> >> >> [root@el8-a01n01 ~]# pcs resource remove srv01-test >> Attempting to stop: srv01-test... Warning: 'srv01-test' is unmanaged >> Error: Unable to stop: srv01-test before deleting (re-run with -- >> force >> to force deletion) >> [root@el8-a01n01 ~]# pcs resource remove srv01-test --force >> Deleting Resource - srv01-test > > Note that the above is all within pcs, which tries to explicitly stop > the resource before removing it. That gives pcs more control over the > process and would ensure that the meta-attributes and operations were > still present for the stop. But as it warns, it can't do that if the > resource is unmanaged, so then it's left to Pacemaker to stop the > resource after the configuration is gone. > > BTW stop-orphan-resources takes precedence over whether the resource is > managed, so Pacemaker does try to stop it even though the node is still > unmanaged. > >> [root@el8-a01n01 ~]# client_loop: send disconnect: Broken pipe >> >> >> As you can see, the node was fenced. The logs on that node were; >> >> >> Jan 18 02:03:55 el8-a01n01.alteeve.ca pacemaker- >> execd[1872]: warning: >> srv01-test_stop_0 process (PID 113779) timed out >> Jan 18 02:03:55 el8-a01n01.alteeve.ca pacemaker- >> execd[1872]: warning: >> srv01-test_stop_0[113779] timed out after 2ms >> Jan 18 02:03:55 el8-a01n01.alteeve.ca pacemaker- >> controld[1875]: error: >> Result of stop operation for srv01-test on el8-a01n01: Timed Out > > A stop timeout causes fencing. 
I'm guessing what happened in this case > is that the stop takes longer than 20s, which is the default operation > timeout, which has to be used because the original operation > configuration was removed. > > The main problem here is that removing a resource from the > configuration means that its stop timeout is lost, so if it's active > when it's removed, the stop might time out. pcs's approach of stopping > it first (by setting target-role to Stopped) makes sense, but having > the node unmanaged blocks that. It's a catch-22 for the cluster that > you'd have to resolve by cha
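Ken's stop-orphan-resources pointer suggests a sequence for the "delete the resource but keep the service running" case asked about above. A sketch, assuming the standard cluster property name and with an illustrative wrapper and resource name:

```shell
# Sketch of "remove the resource from pacemaker but leave the service
# running", built on the stop-orphan-resources property Ken mentions
# (default: true). The run() wrapper and resource name are illustrative.
run() { echo "+ $*"; [ -n "$DRY_RUN" ] || "$@"; }

delete_keep_running() {
    res=$1
    # Tell pacemaker not to stop resources whose configuration disappears.
    run pcs property set stop-orphan-resources=false
    # Removing the CIB entry now leaves the service itself untouched
    # (it shows up as ORPHANED until the property is restored).
    run pcs resource delete "$res" --force
}
```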
Re: [ClusterLabs] Stopping a server failed and fenced, despite disabling stop timeout
On 2021-01-18 4:49 a.m., Tomas Jelinek wrote: > Hi Digimer, > > Regarding pcs behavior: > > When deleting a resource, pcs first sets its target-role to Stopped, > pushes the change into pacemaker and waits for the resource to stop. > Once the resource stops, pcs removes the resource from CIB. If pcs > simply removed the resource from CIB without stopping it first, the > resource would be running as orphaned (until pacemaker stops it if > configured to do so). We want to avoid that. > > If the resource cannot be stopped for whatever reason, pcs reports this > and advises running the delete command with --force. Running 'pcs > resource delete --force' skips the part where pcs sets target role and > waits for the resource to stop, making pcs simply remove the resource > from CIB. > > I agree that pcs should handle deleting unmanaged resources in a better > way. We plan to address that, but it's not on top of the priority list. > Our plan is actually to prevent deleting unmanaged resources (or require > --force to be specified to do so) based on the following scenario: > > If a resource is deleted while in unmanaged state, it ends up in > ORPHANED state - it is removed from CIB but still present in running > configuration. This can cause various issues, i.e. when unmanaged > resource is stopped manually outside of the cluster there might be > problems with stopping the resource upon deletion (while unmanaged) > which may end up with stonith being initiated - this is not desired. > > > Regards, > Tomas This logic makes sense. If I may propose a reason for an alternative method; In my case, the idea I was experimenting with was to remove a running server from cluster management, without actually shutting down the server. This is somewhat contrived, I freely admit, but the idea of taking a server out of the config entirely without shutting it down could be useful in some cases.
In my case, I didn't worry about the orphaned state and the risk of it trying to start elsewhere as there are additional safeguards in place to prevent this (both in our software and in that DRBD is not set to dual-primary, so the VM simply can't start elsewhere while it's running somewhere). Totally understand it's not a priority, but when this is addressed, some special mechanism to say "I know this will leave it orphaned and that's OK" would be nice to have. digimer > Dne 18. 01. 21 v 3:11 Digimer napsal(a): >> Hi all, >> >> Mind the slew of questions, well into testing now and finding lots of >> issues. This one is two questions... :) >> >> I set a server to be unamaged in pacemaker while the server was >> running. Then I tried to remove the resource, and it refused saying it >> couldn't stop it, and to use '--force'. So I did, and the node got >> fenced. Now, the resource was setup with; >> >> pcs resource create srv07-el6 ocf:alteeve:server name="srv07-el6" \ >> meta allow-migrate="true" target-role="started" \ >> op monitor interval="60" start timeout="INFINITY" \ >> on-fail="block" stop timeout="INFINITY" on-fail="block" \ >> migrate_to timeout="INFINITY" >> >> I would have expected the 'stop timeout="INFINITY" on-fail="block"' to >> prevent fencing if the server failed to stop (question 1) and that if a >> resource was unmanaged, that the resource wouldn't even try to stop >> (question 2). >> >> Can someone help me understand what happened here? >> >> digimer >> >> More below; >> >> >> [root@el8-a01n01 ~]# pcs resource remove srv01-test >> Attempting to stop: srv01-test... Warning: 'srv01-test' is unmanaged >> Error: Unable to stop: srv01-test before deleting (re-run with --force >> to force deletion) >> [root@el8-a01n01 ~]# pcs resource remove srv01-test --force >> Deleting Resource - srv01-test >> [root@el8-a01n01 ~]# client_loop: send disconnect: Broken pipe >> >> >> As you can see, the node was fenced. 
The logs on that node were; >> >> >> Jan 18 02:03:55 el8-a01n01.alteeve.ca pacemaker-execd[1872]: warning: >> srv01-test_stop_0 process (PID 113779) timed out >> Jan 18 02:03:55 el8-a01n01.alteeve.ca pacemaker-execd[1872]: warning: >> srv01-test_stop_0[113779] timed out after 2ms >> Jan 18 02:03:55 el8-a01n01.alteeve.ca pacemaker-controld[1875]: error: >> Result of stop operation for srv01-test on el8-a01n01: Timed Out >> Jan 18 02:03:55 el8-a01n01.alteeve.ca pacemaker-controld[1875]: notice: >
Re: [ClusterLabs] Antw: Antw: [EXT] Stopping a server failed and fenced, despite disabling stop timeout
On 2021-01-18 3:31 a.m., Ulrich Windl wrote: >>>> "Ulrich Windl" schrieb am 18.01.2021 um > 09:28 in Nachricht <6005469702a10003e...@gwsmtp.uni-regensburg.de>: >>>>> Digimer schrieb am 18.01.2021 um 03:11 in Nachricht >> <816a4d1e-a92d-2a4c-b1a0-cf4353e3f...@alteeve.ca>: >>> Hi all, >>> >>> Mind the slew of questions, well into testing now and finding lots of >>> issues. This one is two questions... :) >>> >>> I set a server to be unamaged in pacemaker while the server was >>> running. Then I tried to remove the resource, and it refused saying it >>> couldn't stop it, and to use '--force'. So I did, and the node got >>> fenced. Now, the resource was setup with; >> >> My guess is you shouldn't do it that way: Why not stop the resource, >> unconfigure it in the cluster, then start it manually? >> >>> >>> pcs resource create srv07-el6 ocf:alteeve:server name="srv07-el6" \ >>> meta allow-migrate="true" target-role="started" \ >>> op monitor interval="60" start timeout="INFINITY" \ >>> on-fail="block" stop timeout="INFINITY" on-fail="block" \ >>> migrate_to timeout="INFINITY" >>> >>> I would have expected the 'stop timeout="INFINITY" on-fail="block"' to >>> prevent fencing if the server failed to stop (question 1) and that if a >>> resource was unmanaged, that the resource wouldn't even try to stop >>> (question 2). >>> >>> Can someone help me understand what happened here? >> >> Fencing reason was " srv01-test_stop_0 process (PID 113779) timed out". >> >> Did have a failutre before your actions? The logs indicate such it seems: > > Sorry: "Did you have a failure before your actions?" I had, yes, but I cleared it. I'm intentionally doing "weird things" to see how the system reacts, and when things go bad (like this), what can be done to make the system more resilient. If I've learned anything in 10 years of HA, it's that people will do all the things you think they shouldn't do. So I'm trying to do them before they do and learn how to mitigate as much as possible. 
>> "Clearing failure of srv01-test on el8-a01n02 because resource parameters >> have changed" >> >> Haveing the cluster in a clean state before configuring it highly desirable >> IMHO. I use this command frequently to check: "crm_mon -1Arfj" >> >> The logs should help to explain! >> >> Regards, >> Ulrich -- Digimer
[ClusterLabs] Stopping a server failed and fenced, despite disabling stop timeout
Hi all, Mind the slew of questions, well into testing now and finding lots of issues. This one is two questions... :) I set a server to be unmanaged in pacemaker while the server was running. Then I tried to remove the resource, and it refused saying it couldn't stop it, and to use '--force'. So I did, and the node got fenced. Now, the resource was set up with; pcs resource create srv07-el6 ocf:alteeve:server name="srv07-el6" \ meta allow-migrate="true" target-role="started" \ op monitor interval="60" start timeout="INFINITY" \ on-fail="block" stop timeout="INFINITY" on-fail="block" \ migrate_to timeout="INFINITY" I would have expected the 'stop timeout="INFINITY" on-fail="block"' to prevent fencing if the server failed to stop (question 1) and that if a resource was unmanaged, that the resource wouldn't even try to stop (question 2). Can someone help me understand what happened here? digimer More below; [root@el8-a01n01 ~]# pcs resource remove srv01-test Attempting to stop: srv01-test... Warning: 'srv01-test' is unmanaged Error: Unable to stop: srv01-test before deleting (re-run with --force to force deletion) [root@el8-a01n01 ~]# pcs resource remove srv01-test --force Deleting Resource - srv01-test [root@el8-a01n01 ~]# client_loop: send disconnect: Broken pipe As you can see, the node was fenced. The logs on that node were; Jan 18 02:03:55 el8-a01n01.alteeve.ca pacemaker-execd[1872]: warning: srv01-test_stop_0 process (PID 113779) timed out Jan 18 02:03:55 el8-a01n01.alteeve.ca pacemaker-execd[1872]: warning: srv01-test_stop_0[113779] timed out after 2ms Jan 18 02:03:55 el8-a01n01.alteeve.ca pacemaker-controld[1875]: error: Result of stop operation for srv01-test on el8-a01n01: Timed Out Jan 18 02:03:55 el8-a01n01.alteeve.ca pacemaker-controld[1875]: notice: el8-a01n01-srv01-test_stop_0:37 [ The server: [srv01-test] is indeed running.
It will be shut down now.\n ] Jan 18 02:03:55 el8-a01n01.alteeve.ca pacemaker-attrd[1873]: notice: Setting fail-count-srv01-test#stop_0[el8-a01n01]: (unset) -> INFINITY Jan 18 02:03:55 el8-a01n01.alteeve.ca pacemaker-attrd[1873]: notice: Setting last-failure-srv01-test#stop_0[el8-a01n01]: (unset) -> 1610935435 Jan 18 02:03:55 el8-a01n01.alteeve.ca pacemaker-attrd[1873]: notice: Setting fail-count-srv01-test#stop_0[el8-a01n01]: INFINITY -> (unset) Jan 18 02:03:55 el8-a01n01.alteeve.ca pacemaker-attrd[1873]: notice: Setting last-failure-srv01-test#stop_0[el8-a01n01]: 1610935435 -> (unset) client_loop: send disconnect: Broken pipe On the peer node, the logs showed; Jan 18 02:03:13 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: State transition S_IDLE -> S_POLICY_ENGINE Jan 18 02:03:13 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: Calculated transition 58, saving inputs in /var/lib/pacemaker/pengine/pe-input-100.bz2 Jan 18 02:03:13 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: Transition 58 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-100.bz2): Complete Jan 18 02:03:13 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: State transition S_TRANSITION_ENGINE -> S_IDLE Jan 18 02:03:18 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: State transition S_IDLE -> S_POLICY_ENGINE Jan 18 02:03:18 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: Calculated transition 59, saving inputs in /var/lib/pacemaker/pengine/pe-input-101.bz2 Jan 18 02:03:18 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: Transition 59 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-101.bz2): Complete Jan 18 02:03:18 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: State transition S_TRANSITION_ENGINE -> S_IDLE Jan 18 02:03:35 el8-a01n02.alteeve.ca pacemaker-controld[490050]: notice: State transition S_IDLE -> 
S_POLICY_ENGINE Jan 18 02:03:35 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: warning: Detected active orphan srv01-test running on el8-a01n01 Jan 18 02:03:35 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: Clearing failure of srv01-test on el8-a01n02 because resource parameters have changed Jan 18 02:03:35 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: Removing srv01-test from el8-a01n01 Jan 18 02:03:35 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: Removing srv01-test from el8-a01n02 Jan 18 02:03:35 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: * Stop srv01-test ( el8-a01n01 ) due to node availability Jan 18 02:03:35 el8-a01n02.alteeve.ca pacemaker-schedulerd[490049]: notice: Calculated transition 60, saving inputs in /var/lib/pacemaker/pengine/pe-input-102.bz2 Jan 18 02:03:35 el8-a01n02.
[ClusterLabs] Completely disabled resource failure triggered fencing
Hi all, I'm trying to figure out how to define a resource such that if it fails in any way, it will not cause pacemaker to self-fence. The reasoning being that there are relatively minor ways to fault a single resource (these are VMs, so for example, a bad edit to the XML definition renders it invalid, or the definition is accidentally removed). In a case like this, I fully expect that resource to enter a failed state. Of course, pacemaker won't be able to stop it, migrate it, etc. When this happens currently, it causes the host to self-fence, taking down all other hosted resources (servers). This is less than ideal. Is there a way to tell pacemaker that if it's unable to manage a resource, it flags it as failed and leaves it at that? I've been trying to do this and my config so far is; pcs resource create srv07-el6 ocf:alteeve:server name="srv07-el6" \ meta allow-migrate="true" target-role="stopped" \ op monitor interval="60" start timeout="INFINITY" \ on-fail="block" stop timeout="INFINITY" on-fail="block" \ migrate_to timeout="INFINITY" This is getting cumbersome and still, in testing, I'm finding cases where the node gets fenced when something breaks the resource in a creative way. Thanks for any insight/guidance! -- Digimer
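The operation list in the command above is easier to audit with one op keyword per operation. A hedged restatement (same agent, names and values as the post; the wrapper function and dry-run flag are illustrative additions):

```shell
# Restatement of the resource definition from the post with one "op" entry
# per operation, making each timeout/on-fail pairing explicit. All names
# and values come from the post; the wrapper function is illustrative.
run() { echo "+ $*"; [ -n "$DRY_RUN" ] || "$@"; }

create_srv07() {
    run pcs resource create srv07-el6 ocf:alteeve:server name="srv07-el6" \
        meta allow-migrate="true" target-role="stopped" \
        op monitor interval="60" \
        op start timeout="INFINITY" on-fail="block" \
        op stop timeout="INFINITY" on-fail="block" \
        op migrate_to timeout="INFINITY"
}
```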
Re: [ClusterLabs] Best way to obtain timestamp when node was set to "standby"
On 2020-10-20 8:23 p.m., Dirk Gassen wrote: > Hi Alltogether, > > What would be the best way to obtain since when a node is in the current > state, preferably from a script. > > For example, I can see that a node is in standby but I would like to > know when this happened without looking at logs or at individual files > in /var/lib/pacemaker/pengine/ > > Currently using pacemaker 1.1.18 and corosync 2.4.3. > > Dirk I don't think this is supported internally. Off hand, I would suggest you could look at the historical cib.xml files (the old versions are stored as cib.xml.XX, iirc). If you can find the file where the transition you're interested in happened, you could look at the file's mtime to get an idea of when it happened. You could also parse the system logs, if the event you're curious about happened recently enough. Just a couple ideas, not sure how well they'd work in practice. -- Digimer
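The mtime idea above can be scripted. This is a sketch with assumptions: the archived-CIB naming (cib-N.raw) and the exact nvpair text for standby vary by pacemaker version, so the glob and pattern may need adjusting.

```shell
# Sketch of the "check the archived CIBs' mtimes" idea above. The archive
# naming (cib-*.raw) and the nvpair text matched here are assumptions that
# may vary by pacemaker version; treat this as a starting point.
find_standby_since() {
    dir=$1   # e.g. /var/lib/pacemaker/cib
    for f in "$dir"/cib-*.raw; do
        [ -e "$f" ] || continue
        if grep -q 'name="standby" value="on"' "$f"; then
            # Print "mtime  path" for every archive recording standby.
            echo "$(date -r "$f" '+%Y-%m-%d %H:%M:%S')  $f"
        fi
    done | sort | head -n 1   # the oldest match approximates "since when"
}
```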
Re: [ClusterLabs] Maintenance mode status in CIB
On 2020-10-13 11:59 a.m., Strahil Nikolov wrote: > Also, it's worth mentioning that you can set the whole cluster in global > maintenance and power off the stack on all nodes without affecting your > resources. > I'm not sure if that is ever possible in node maintenance. > > Best Regards, > Strahil Nikolov Can you clarify what you mean by "power off the stack on all nodes"? Do you mean stop the pacemaker/corosync/knet daemons themselves without issue? -- Digimer
Re: [ClusterLabs] Maintenance mode status in CIB
On 2020-10-13 5:41 a.m., Jehan-Guillaume de Rorthais wrote: > On Tue, 13 Oct 2020 04:48:04 -0400 > Digimer wrote: > >> On 2020-10-13 4:32 a.m., Jehan-Guillaume de Rorthais wrote: >>> On Mon, 12 Oct 2020 19:08:39 -0400 >>> Digimer wrote: >>> >>>> Hi all, >>> >>> Hi you, >>> >>>> >>>> I noticed that there appear to be a global "maintenance mode" >>>> attribute under cluster_property_set. This seems to be independent of >>>> node maintenance mode. It seemed to not change even when using >>>> 'pcs node maintenance --all' >>> >>> You can set maintenance-mode using: >>> >>> pcs property set maintenance-mode=true >>> >>> You can read about "maintenance-mode" cluster attribute and "maintenance" >>> node attribute in chapters: >>> >>> >>> https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html/Pacemaker_Explained/s-cluster-options.html >>> >>> >>> https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html/Pacemaker_Explained/_special_node_attributes.html >>> >>> I would bet the difference is that "maintenance-mode" applies to all nodes >>> in one single action. Using 'pcs node maintenance --all', each pcsd daemon >>> apply the local node maintenance independently. >>> >>> With the later, I suppose you might have some lag between nodes to actually >>> start the maintenance, depending on external factors. Moreover, you can >>> start/exit the maintenance mode independently on each nodes. >> >> Thanks for this. >> >> A question remains; Is it possible that: >> >> <nvpair name="maintenance-mode" value="false"/> >> >> Could be set, and a given node could be: >> >> <nvpair name="maintenance" value="true"/> >> >> That is to say; If the cluster is set to maintenance mode, does that >> mean I should consider all nodes to also be in maintenance mode, >> regardless of what their individual maintenance mode might be set to? > > I remember a similar discussion happening some months ago.
I believe Ken > answered your question there: > > https://lists.clusterlabs.org/pipermail/developers/2019-November/002242.html > > The whole answer is informative, but the conclusion might answer your > question: > > >> There is some room for coming up with better option naming and meaning. > For > >> example maybe the cluster-wide "maintenance-mode" should be something > >> like "force-maintenance" to make clear it takes precedence over node and > >> resource maintenance. > > I understand here that "maintenance-mode" takes precedence over individual > node > maintenance mode. > > Regards, Very helpful, thank you kindly! -- Digimer
Re: [ClusterLabs] Maintenance mode status in CIB
On 2020-10-13 4:32 a.m., Jehan-Guillaume de Rorthais wrote: > On Mon, 12 Oct 2020 19:08:39 -0400 > Digimer wrote: > >> Hi all, > > Hi you, > >> >> I noticed that there appears to be a global "maintenance mode" >> attribute under cluster_property_set. This seems to be independent of >> node maintenance mode. It seemed not to change even when using >> 'pcs node maintenance --all' > > You can set maintenance-mode using: > > pcs property set maintenance-mode=true > > You can read about the "maintenance-mode" cluster attribute and the "maintenance" node > attribute in these chapters: > > https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html/Pacemaker_Explained/s-cluster-options.html > > https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html/Pacemaker_Explained/_special_node_attributes.html > > I would bet the difference is that "maintenance-mode" applies to all nodes in > one single action. Using 'pcs node maintenance --all', each pcsd daemon applies > the local node maintenance independently. > > With the latter, I suppose you might have some lag between nodes to actually > start the maintenance, depending on external factors. Moreover, you can > start/exit the maintenance mode independently on each node. Thanks for this. A question remains; is it possible that: [cluster property XML stripped by the list archive] could be set, and a given node could be: [node maintenance XML stripped by the list archive] That is to say; if the cluster is set to maintenance mode, does that mean I should consider all nodes to also be in maintenance mode, regardless of what their individual maintenance mode might be set to? -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
[ClusterLabs] Maintenance mode status in CIB
Hi all, I noticed that there appears to be a global "maintenance mode" attribute under cluster_property_set. This seems to be independent of node maintenance mode. It seemed not to change even when using 'pcs node maintenance --all'. What is the difference between these attributes? Cheers -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Antw: [EXT] Avoiding self-fence on RA failure
On 2020-10-07 2:35 a.m., Ulrich Windl wrote: >>>> Digimer schrieb am 07.10.2020 um 05:42 in Nachricht > : >> Hi all, >> >> While developing our program (and not being a production cluster), I >> find that when I push broken code to a node, causing the RA to fail to >> perform an operation, the node gets fenced. (example below). > > (I see others have replied, too, but anyway) > Specifically it's the "stop" operation that may not fail. > >> >> This brings up a question; >> >> If a single resource fails for any reason and can't be recovered, but >> other resources on the node are still operational, how can I suppress a >> self-fence? I'd rather one failed resource than having all resources get >> killed (they're VMs, so restarting on the peer is ... disruptive). > > I think you can (on-fail=block, AFAIR). > Note: This is not a political statement for any near elections ;-) Indeed, and this works. I misunderstood the pcs syntax and applied the 'on-fail="stop"' to the monitor operation... Woops. >> If this is a bad approach (sufficiently bad to justify hard-rebooting >> other VMs that had been running on the same node), why is that? Are >> there any less-bad options for this scenario? >> >> Obviously, I would never push untested code to a production system, >> but knowing now that this is possible (losing a node with its other VMs >> on an RA / code fault), I'm worried about some unintended "oops" causing >> the loss of a node. >> >> For example, would it be possible to have the node try to live migrate >> services to the other peer, before self-fencing in a scenario like this? > > As there is no guarantee that migration will succeed without fencing the node, it > could only be done with a timeout; otherwise the node will be hanging while > waiting for migration to succeed. I figured as much. >> Are there other options / considerations I might be missing here?
>> example VM config: >> >> [primitive XML mangled by the list archive; the surviving attributes show type="server", an instance attribute with value="srv07-el6", meta attributes allow-migrate="true", migrate_to="INFINITY", target-role="Stopped", and ops migrate_from timeout="600", migrate_to timeout="INFINITY", monitor on-fail="block", notify timeout="20", start timeout="30", stop timeout="INFINITY"] >> >> Logs from a code oops in the RA triggering a node self-fence; >> >> Oct 06 23:33:54 mk-a02n01.digimer.ca pacemaker-execd[33816]: notice: >> srv07-el6_stop_0:36779:stderr [ DBD::Pg::db do failed: ERROR: syntax >> error at or near "3" ] >> Oct 06 23:33:54 mk-a02n01.digimer.ca pacemaker-execd[33816]: notice: >> srv07-el6_stop_0:36779:stderr [ LINE 1: ...ut off, server_boot_time = 0 >> WHERE server_uuid = '3d73db4c-d... ] >> Oct 06 23:33:54 mk-a02n01.digimer.ca pacemaker-execd[33816]: notice: >> srv07-el6_stop_0:36779:stderr [ >> ^ at /usr/share/perl5/Anvil/Tools/Database.pm line >> 13791. ] > > As I'm writing a lot of Perl code, too: Do you know "perl -c" to check the > syntax, BTW? > > And don't forget ocf-tester. ;-) I did not know about ocf-tester, thanks for the hint. As for 'perl -c', the issue above was caused by a bad SQL statement, don't think perl can catch that. :) -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Avoiding self-fence on RA failure
On 2020-10-07 2:20 a.m., Digimer wrote: > On 2020-10-07 1:49 a.m., Andrei Borzenkov wrote: >> 07.10.2020 06:42, Digimer пишет: >>> Hi all, >>> >>> While developing our program (and not being a production cluster), I >>> find that when I push broken code to a node, causing the RA to fail to >>> perform an operation, the node gets fenced. (example below). >>> >>> This brings up a question; >>> >>> If a single resource fails for any reason and can't be recovered, but >>> other resources on the node are still operational, how can I suppress a >>> self-fence? I'd rather one failed resource than having all resources get >>> killed (they're VMs, so restarting on the peer is ... disruptive). >>> >>> If this is a bad approach (sufficiently bad to justify hard-rebooting >>> other VMs that had been running on the same node), why is that? Are >>> there any less-bad options for this scenario? >>> >>> Obviously, I would never push untested code to a production system, >>> but knowing now that this is possible (losing a node with it's other VMs >>> on an RA / code fault), I'm worried about some unintended "oops" causing >>> the loss of a node. >>> >>> For example, would it be possible to have the node try to live migrate >>> services to the other peer, before self-fencing in a scenario like this? >>> Are there other options / considerations I might be missing here? 
>>> example VM config: >>> >>> [primitive XML mangled by the list archive; the surviving attributes show type="server", an instance attribute with value="srv07-el6", meta attributes allow-migrate="true", migrate_to="INFINITY", target-role="Stopped", and ops migrate_from timeout="600", migrate_to timeout="INFINITY", monitor on-fail="block", notify timeout="20", start timeout="30", stop timeout="INFINITY"] >>> >>> Logs from a code oops in the RA triggering a node self-fence; >>> >>> Oct 06 23:33:54 mk-a02n01.digimer.ca pacemaker-execd[33816]: notice: >>> srv07-el6_stop_0:36779:stderr [ DBD::Pg::db do failed: ERROR: syntax >>> error at or near "3" ] >> >> Only stop operation failure results in stonith by default, you can >> change it with the on-fail operation attribute. The only other sensible >> value would be "block". > Ah, it looks like I misunderstood how on-fail="block" works. I see in > the CIB it was only applied to the monitor action (which I probably > don't want, as I want it to recover if a monitor fails). > > I've changed the CIB to below, I'll see how this handles future code > oopses. > > Thanks! > > digimer > > [updated primitive XML mangled by the list archive; the relevant change: the monitor op no longer sets on-fail, while the start and stop ops now carry on-fail="block" timeout="INFINITY"] > > Update, this worked! I faulted the RA and the server entered a FAILED state, no fencing. Thanks again! -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops."
- Stephen Jay Gould ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
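The rule at the heart of this thread (by default only a failed stop escalates to fencing, and a per-operation on-fail overrides that) can be condensed into a tiny decision table. This is a toy sketch of the behaviour described above, not pacemaker's actual code, and the "restart" default for non-stop operations is a simplification:

```python
def recovery_action(op_name, on_fail=None):
    """Roughly what pacemaker decides when an operation fails.
    Per the thread: an explicit on-fail wins; otherwise a failed
    stop escalates to fencing, other failures trigger recovery."""
    if on_fail is not None:
        return on_fail          # explicit setting, e.g. "block"
    return "fence" if op_name == "stop" else "restart"

# The config from this thread, before and after the fix:
# before, on-fail="block" was (mistakenly) on monitor only;
# after, it is on stop (and start), so a failed stop no longer fences.
ops_before = {"monitor": {"on_fail": "block"}, "stop": {}}
ops_after  = {"monitor": {}, "stop": {"on_fail": "block"}}

def on_stop_failure(ops):
    """What happens when the stop operation fails under a given config."""
    return recovery_action("stop", ops["stop"].get("on_fail"))
```

Under this model the original config fences the node on a failed stop, while the corrected config merely blocks the resource, matching the observed behaviour.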
Re: [ClusterLabs] Avoiding self-fence on RA failure
On 2020-10-07 1:49 a.m., Andrei Borzenkov wrote: > 07.10.2020 06:42, Digimer пишет: >> Hi all, >> >> While developing our program (and not being a production cluster), I >> find that when I push broken code to a node, causing the RA to fail to >> perform an operation, the node gets fenced. (example below). >> >> This brings up a question; >> >> If a single resource fails for any reason and can't be recovered, but >> other resources on the node are still operational, how can I suppress a >> self-fence? I'd rather one failed resource than having all resources get >> killed (they're VMs, so restarting on the peer is ... disruptive). >> >> If this is a bad approach (sufficiently bad to justify hard-rebooting >> other VMs that had been running on the same node), why is that? Are >> there any less-bad options for this scenario? >> >> Obviously, I would never push untested code to a production system, >> but knowing now that this is possible (losing a node with it's other VMs >> on an RA / code fault), I'm worried about some unintended "oops" causing >> the loss of a node. >> >> For example, would it be possible to have the node try to live migrate >> services to the other peer, before self-fencing in a scenario like this? >> Are there other options / considerations I might be missing here? 
>> example VM config: >> >> [primitive XML mangled by the list archive; the surviving attributes show type="server", an instance attribute with value="srv07-el6", meta attributes allow-migrate="true", migrate_to="INFINITY", target-role="Stopped", and ops migrate_from timeout="600", migrate_to timeout="INFINITY", monitor on-fail="block", notify timeout="20", start timeout="30", stop timeout="INFINITY"] >> >> Logs from a code oops in the RA triggering a node self-fence; >> >> ==== >> Oct 06 23:33:54 mk-a02n01.digimer.ca pacemaker-execd[33816]: notice: >> srv07-el6_stop_0:36779:stderr [ DBD::Pg::db do failed: ERROR: syntax >> error at or near "3" ] > > Only stop operation failure results in stonith by default, you can > change it with the on-fail operation attribute. The only other sensible > value would be "block". Ah, it looks like I misunderstood how on-fail="block" works. I see in the CIB it was only applied to the monitor action (which I probably don't want, as I want it to recover if a monitor fails). I've changed the CIB to below, I'll see how this handles future code oopses. Thanks! digimer -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
[ClusterLabs] Avoiding self-fence on RA failure
Hi all, While developing our program (and not being a production cluster), I find that when I push broken code to a node, causing the RA to fail to perform an operation, the node gets fenced (example below). This brings up a question; If a single resource fails for any reason and can't be recovered, but other resources on the node are still operational, how can I suppress a self-fence? I'd rather one failed resource than having all resources get killed (they're VMs, so restarting on the peer is ... disruptive). If this is a bad approach (sufficiently bad to justify hard-rebooting other VMs that had been running on the same node), why is that? Are there any less-bad options for this scenario? Obviously, I would never push untested code to a production system, but knowing now that this is possible (losing a node with its other VMs on an RA / code fault), I'm worried about some unintended "oops" causing the loss of a node. For example, would it be possible to have the node try to live migrate services to the other peer, before self-fencing in a scenario like this? Are there other options / considerations I might be missing here? example VM config: [XML stripped by the list archive] Logs from a code oops in the RA triggering a node self-fence; Oct 06 23:33:54 mk-a02n01.digimer.ca pacemaker-execd[33816]: notice: srv07-el6_stop_0:36779:stderr [ DBD::Pg::db do failed: ERROR: syntax error at or near "3" ] Oct 06 23:33:54 mk-a02n01.digimer.ca pacemaker-execd[33816]: notice: srv07-el6_stop_0:36779:stderr [ LINE 1: ...ut off, server_boot_time = 0 WHERE server_uuid = '3d73db4c-d... ] Oct 06 23:33:54 mk-a02n01.digimer.ca pacemaker-execd[33816]: notice: srv07-el6_stop_0:36779:stderr [ ^ at /usr/share/perl5/Anvil/Tools/Database.pm line 13791.
] Oct 06 23:33:54 mk-a02n01.digimer.ca pacemaker-execd[33816]: notice: srv07-el6_stop_0:36779:stderr [ DBD::Pg::db do failed: ERROR: syntax error at or near "3" ] Oct 06 23:33:54 mk-a02n01.digimer.ca pacemaker-execd[33816]: notice: srv07-el6_stop_0:36779:stderr [ LINE 1: ...ut off, server_boot_time = 0 WHERE server_uuid = '3d73db4c-d... ] Oct 06 23:33:54 mk-a02n01.digimer.ca pacemaker-execd[33816]: notice: srv07-el6_stop_0:36779:stderr [ ^ at /usr/share/perl5/Anvil/Tools/Database.pm line 13791. ] Oct 06 23:33:54 mk-a02n01.digimer.ca pacemaker-controld[33819]: notice: Result of stop operation for srv07-el6 on mk-a02n01: 1 (error) Oct 06 23:33:54 mk-a02n01.digimer.ca pacemaker-controld[33819]: notice: mk-a02n01-srv07-el6_stop_0:51 [ DBD::Pg::db do failed: ERROR: syntax error at or near "3"\nLINE 1: ...ut off, server_boot_time = 0 WHERE server_uuid = '3d73db4c-d...\n ^ at /usr/share/perl5/Anvil/Tools/Database.pm line 13791.\nDBD::Pg::db do failed: ERROR: syntax error at or near "3"\nLINE 1: ...ut off, server_boot_time = 0 WHERE server_uuid = '3d73db4c-d...\n ^ at /usr/share/p Oct 06 23:33:54 mk-a02n01.digimer.ca pacemaker-attrd[33817]: notice: Setting fail-count-srv07-el6#stop_0[mk-a02n01]: (unset) -> INFINITY Oct 06 23:33:54 mk-a02n01.digimer.ca pacemaker-attrd[33817]: notice: Setting last-failure-srv07-el6#stop_0[mk-a02n01]: (unset) -> 1602041634 Connection to mk-a02n01.ifn closed by remote host. Connection to mk-a02n01.ifn closed. -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Antw: [EXT] Re: Determine a resource's current host in the CIB
On 2020-09-29 9:15 a.m., Ulrich Windl wrote: >>>> Christopher Lumens schrieb am 24.09.2020 um 17:48 in > Nachricht <801519843.20826233.1600962508662.javamail.zim...@redhat.com>: >> This is the kind of stuff I've been working on a lot, so hopefully I've >> added enough tools to make this easy to do. If not, I guess I've got >> more work to do. >> >> What's your time frame, are you lucky enough to be able to use the latest >> pacemaker releases, and are you going to use command line tools or the >> library? >> >>>> A quick look at the code shows that crm_resource - where we would >>>> have --locate - does have the --xml-file as well. But that seems >>>> not to do what I expected, although I haven't looked into the details. >>> >>> This is an interesting option... I can see that it shows me running >>> resources only (not what resources are configured but off). It does show >>> more directly "VM x is on node y"; >> >> Coming eventually, crm_resource will support --output-as=xml, which means >> you'd potentially get the output of "crm_resource -W" in a structured >> format you could easily parse. >> >> You could also use crm_mon and xmllint to do this right now on the >> command line. First you could use crm_mon to dump a single resource as >> XML: >> >> $ crm_mon --include=none,resources --resource dummy --output-as xml >> >> [resource XML mangled by the list archive; the surviving attributes are role="Started" active="true" orphaned="false" blocked="false" managed="true" failed="false" failure_ignored="false" nodes_running_on="1"] >> >> And then you construct an xpath query to get just the attribute you >> want: >> >> $ crm_mon --include=none,resources --resource dummy --output-as xml | xmllint --xpath >> '//resource/node/@name' - >> name="cluster02" >> >> You could do the same kinds of queries using libxml2 once you've parsed >> stdout of crm_mon. We've got examples scattered throughout the >> pacemaker code, or I could probably try to remember how it's done. > > Hi!
> > A few years ago I wrote a (Perl) tool that parses and linearizes the XML CIB. > For example, using that tool to sort all resources by operation execution time, > the command was: > > ./pmkrstat.pl --cib-object=status --query '*:exec-time' \ > --attribute id=1 --attribute operation=2 --attribute on_node=3 \ > --attribute rc-code=4 --attribute queue-time=5 --attribute exec-time=6 \ > --attribute last-run=7 \ > --show full_path=0 --show name=0 --show attr_name=0 --no-handle-values | \ > sed -e 's/(//' -e 's/)//' -e 's/"//g' -e 's/,/ /g' | grep "$1" | sort -k6n -k1 > > So on output you have 7 fields: id, operation, on_node, rc-code, queue-time, > exec-time, and last-run > > So an output line could be: > prm_db_last_0 start node-13 0 0 31858 1592836459 > > Ugly, isn't it? > > Regards, > Ulrich <3 perl Thanks for this, but I decided to use crm_mon's XML output to determine running resource location and state (building on the CIB parser I already had working). So I'm sorted now, but this would likely have been helpful otherwise. Cheers -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
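The xmllint pipeline shown earlier in this message can also be done in-process with a stock XML library. A sketch against a crm_mon-style fragment modeled on the sample output in this thread; the XML below is illustrative, not captured from a live cluster:

```python
import xml.etree.ElementTree as ET

# Modeled on the `crm_mon --output-as xml` sample quoted in the thread.
CRM_MON_XML = """
<pacemaker-result api-version="2.3" request="crm_mon">
  <resources>
    <resource id="dummy" resource_agent="ocf::pacemaker:Dummy" role="Started"
              active="true" orphaned="false" blocked="false" managed="true"
              failed="false" failure_ignored="false" nodes_running_on="1">
      <node name="cluster02" id="2" cached="true"/>
    </resource>
  </resources>
</pacemaker-result>
"""

def hosting_nodes(xml_text, resource_id):
    """In-process equivalent of: xmllint --xpath '//resource/node/@name'.
    Returns the list of node names a resource is reported running on."""
    root = ET.fromstring(xml_text)
    return [n.get("name")
            for n in root.findall(f".//resource[@id='{resource_id}']/node")]
```

For the fragment above, hosting_nodes(CRM_MON_XML, "dummy") yields ["cluster02"], matching the name="cluster02" result of the xmllint query; an unknown resource id yields an empty list.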
Re: [ClusterLabs] Determine a resource's current host in the CIB
On 2020-09-24 5:00 a.m., Klaus Wenninger wrote: > On 9/24/20 9:19 AM, Reid Wahl wrote: >> **Directly via the CIB**, I don't see a more obvious way than looking >> for the most recent (perhaps by last-rc-change) successful >> (rc-code="0" or rc-code="8") monitor operation. That might be >> error-prone. I haven't looked into exactly how crm_simulate parses >> resource status from the CIB XML yet. Others on the list might know. >> >> Is there a particular reason why you need to parse the status directly >> from the CIB, as opposed to using other tools? Does your use case >> allow you to use crm_simulate with the cib.xml as input? (e.g., >> `crm_simulate --xml-file=`) > You might as well parse output of crm_mon using --xml-file. > > A quick look at the code shows that crm_resource - where we would > have --locate - does have the --xml-file as well. But that seems > not to do what I expected, although I haven't looked into the details. This is an interesting option... I can see that it shows me running resources only (not what resources are configured but off). It does show more directly "VM x is on node y"; I'd like to avoid writing a new parser as I'll still need to read/parse the CIB anyway to know about off resources, and I have to assume there is a way to determine the same from the CIB itself. How to determine what is running where has to be determinable... digimer >> On Wed, Sep 23, 2020 at 11:04 PM Digimer wrote: >>> Hi all, >>> >>> I'm trying to parse the CIB to determine which node a given resource >>> (VM) is currently running on. I notice that the 'monitor' shows in both >>> node's status element (from when it last ran when the node previously >>> hosted the resource). 
>>> >>> https://pastebin.com/6RCMWdgq >>> >>> Specifically, I see under node 1 (the active host when the CIB was read): >>> >>> >> operation_key="srv07-el6_monitor_6" operation="monitor" >>> crm-debug-origin="do_update_resource" crm_feature_set="3.3.0" >>> transition-key="23:85:0:829209fd-35f2-4626-a9cd-f8a50a62871e" >>> transition-magic="0:0;23:85:0:829209fd-35f2-4626-a9cd-f8a50a62871e" >>> exit-reason="" on_node="mk-a02n01" call-id="76" rc-code="0" >>> op-status="0" interval="6" last-rc-change="1600925201" >>> exec-time="541" queue-time="0" >>> op-digest="65d0f0c9227f2593835f5de6c9cb9d0e"/> >>> >>> And under node 2 (hosted the server in the past): >>> >>> >> operation_key="srv07-el6_monitor_6" operation="monitor" >>> crm-debug-origin="do_update_resource" crm_feature_set="3.3.0" >>> transition-key="23:83:0:829209fd-35f2-4626-a9cd-f8a50a62871e" >>> transition-magic="0:0;23:83:0:829209fd-35f2-4626-a9cd-f8a50a62871e" >>> exit-reason="" on_node="mk-a02n02" call-id="61" rc-code="0" >>> op-status="0" interval="6" last-rc-change="1600925173" >>> exec-time="539" queue-time="0" >>> op-digest="65d0f0c9227f2593835f5de6c9cb9d0e"/> >>> >>> I don't see any specific entry in the CIB saying "resource X is >>> currently hosted on node Y", so I assume I should infer which node is >>> the current host? If so, should I look at which node's 'exec-time' is >>> higher, or which node has the higher 'call-id'? >>> >>> Or am I missing a more obvious way to tell what resource is running on >>> which node? >>> >>> -- >>> Digimer >>> Papers and Projects: https://alteeve.com/w/ >>> "I am, somehow, less interested in the weight and convolutions of >>> Einstein’s brain than in the near certainty that people of equal talent >>> have lived and died in cotton fields and sweatshops." 
- Stephen Jay Gould >>> ___ >>> Manage your subscription: >>> https://lists.clusterlabs.org/mailman/listinfo/users >>> >>> ClusterLabs home: https://www.clusterlabs.org/ >> >> > -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Determine a resource's current host in the CIB
On 2020-09-24 3:19 a.m., Reid Wahl wrote: > **Directly via the CIB**, I don't see a more obvious way than looking > for the most recent (perhaps by last-rc-change) successful > (rc-code="0" or rc-code="8") monitor operation. That might be > error-prone. I haven't looked into exactly how crm_simulate parses > resource status from the CIB XML yet. Others on the list might know. > > Is there a particular reason why you need to parse the status directly > from the CIB, as opposed to using other tools? Does your use case > allow you to use crm_simulate with the cib.xml as input? (e.g., > `crm_simulate --xml-file=`) I'm writing a parser for our next HA platform, and XML is a lot more machine-parsable than output meant for human consumption. Also, adding a layer between the data and the processor seems like an avoidable failure point (i.e., what if crm_mon output changes format and breaks the regex used to pull data). Programs like crm_mon and pcs have a way of determining what is running where, and that's fundamentally what I am trying to do as well. digimer > On Wed, Sep 23, 2020 at 11:04 PM Digimer wrote: >> >> Hi all, >> >> I'm trying to parse the CIB to determine which node a given resource >> (VM) is currently running on. I notice that the 'monitor' shows in both >> node's status element (from when it last ran when the node previously >> hosted the resource). 
>> >> https://pastebin.com/6RCMWdgq >> >> Specifically, I see under node 1 (the active host when the CIB was read): >> >> > operation_key="srv07-el6_monitor_6" operation="monitor" >> crm-debug-origin="do_update_resource" crm_feature_set="3.3.0" >> transition-key="23:85:0:829209fd-35f2-4626-a9cd-f8a50a62871e" >> transition-magic="0:0;23:85:0:829209fd-35f2-4626-a9cd-f8a50a62871e" >> exit-reason="" on_node="mk-a02n01" call-id="76" rc-code="0" >> op-status="0" interval="6" last-rc-change="1600925201" >> exec-time="541" queue-time="0" >> op-digest="65d0f0c9227f2593835f5de6c9cb9d0e"/> >> >> And under node 2 (hosted the server in the past): >> >> > operation_key="srv07-el6_monitor_6" operation="monitor" >> crm-debug-origin="do_update_resource" crm_feature_set="3.3.0" >> transition-key="23:83:0:829209fd-35f2-4626-a9cd-f8a50a62871e" >> transition-magic="0:0;23:83:0:829209fd-35f2-4626-a9cd-f8a50a62871e" >> exit-reason="" on_node="mk-a02n02" call-id="61" rc-code="0" >> op-status="0" interval="6" last-rc-change="1600925173" >> exec-time="539" queue-time="0" >> op-digest="65d0f0c9227f2593835f5de6c9cb9d0e"/> >> >> I don't see any specific entry in the CIB saying "resource X is >> currently hosted on node Y", so I assume I should infer which node is >> the current host? If so, should I look at which node's 'exec-time' is >> higher, or which node has the higher 'call-id'? >> >> Or am I missing a more obvious way to tell what resource is running on >> which node? >> >> -- >> Digimer >> Papers and Projects: https://alteeve.com/w/ >> "I am, somehow, less interested in the weight and convolutions of >> Einstein’s brain than in the near certainty that people of equal talent >> have lived and died in cotton fields and sweatshops." 
- Stephen Jay Gould >> ___ >> Manage your subscription: >> https://lists.clusterlabs.org/mailman/listinfo/users >> >> ClusterLabs home: https://www.clusterlabs.org/ > > > -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
[ClusterLabs] Determine a resource's current host in the CIB
Hi all, I'm trying to parse the CIB to determine which node a given resource (VM) is currently running on. I notice that the 'monitor' shows in both nodes' status elements (from when it last ran when the node previously hosted the resource). https://pastebin.com/6RCMWdgq Specifically, I see under node 1 (the active host when the CIB was read): [lrm_rsc_op XML stripped by the list archive] And under node 2 (hosted the server in the past): [lrm_rsc_op XML stripped by the list archive] I don't see any specific entry in the CIB saying "resource X is currently hosted on node Y", so I assume I should infer which node is the current host? If so, should I look at which node's 'exec-time' is higher, or which node has the higher 'call-id'? Or am I missing a more obvious way to tell what resource is running on which node? -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
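For what it's worth, the inference asked about here can be mechanized along the lines suggested later in the thread: among the successful (rc-code 0 or 8) monitor results recorded per node, the newest last-rc-change points at the current host; call-ids appear to be assigned per node, so comparing them across nodes is dubious. A sketch reusing the attribute values quoted in this thread, purely as an illustration:

```python
import xml.etree.ElementTree as ET

# Status entries modeled on the two lrm_rsc_op elements quoted in the thread
# (the wrapping <status> element here is a simplification of the real CIB).
STATUS = """
<status>
  <lrm_rsc_op operation_key="srv07-el6_monitor_6" operation="monitor"
              on_node="mk-a02n01" call-id="76" rc-code="0"
              last-rc-change="1600925201"/>
  <lrm_rsc_op operation_key="srv07-el6_monitor_6" operation="monitor"
              on_node="mk-a02n02" call-id="61" rc-code="0"
              last-rc-change="1600925173"/>
</status>
"""

def likely_host(status_xml):
    """Pick the node whose successful monitor result is most recent."""
    ops = [op for op in ET.fromstring(status_xml)
           if op.get("operation") == "monitor"
           and op.get("rc-code") in ("0", "8")]   # running / promoted
    newest = max(ops, key=lambda op: int(op.get("last-rc-change")))
    return newest.get("on_node")
```

With the values above this picks mk-a02n01 (last-rc-change 1600925201 beats 1600925173), matching "the active host when the CIB was read". As the thread notes, crm_mon's XML output is the less error-prone route.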
Re: [ClusterLabs] 小型外贸公司为何订单源源不断;
Sorry all, I'm not sure how this spam got through. I may have mis-clicked when filtering the queue. digimer On 2020-09-17 12:15 p.m., hello wrote: > users您好 > > ___ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ > -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
[ClusterLabs] Triggering script on cib change
Is there a way to invoke a script when something happens with the cluster? Be it a simple transition, stonith action, resource dis/enable or recovery, etc? -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
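One mechanism worth investigating here (assuming a reasonably recent Pacemaker) is alert agents: the cluster can be configured to invoke an arbitrary script on events such as node changes, fencing actions, and resource operations, with details passed in CRM_alert_* environment variables. A hypothetical agent sketch; verify the variable names and values against your Pacemaker version's documentation before relying on them:

```python
import os

def describe_alert(env):
    """Summarize a pacemaker alert from its CRM_alert_* environment.
    CRM_alert_kind is documented as node|fencing|resource|attribute."""
    kind = env.get("CRM_alert_kind", "unknown")
    if kind == "fencing":
        return f"fencing: {env.get('CRM_alert_desc', '')}"
    if kind == "resource":
        return (f"resource {env.get('CRM_alert_rsc')} "
                f"{env.get('CRM_alert_task')} on {env.get('CRM_alert_node')}")
    return f"{kind} event on {env.get('CRM_alert_node', '?')}"

if __name__ == "__main__":
    # When installed as an alert agent, pacemaker supplies the environment.
    print(describe_alert(os.environ))
```

The agent itself would be registered with something like `pcs alert create path=/usr/local/bin/my-alert.py` and could log, notify, or trigger follow-up actions from there.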
[ClusterLabs] test, please ignore
Mail server test, please ignore -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Two-node Pacemaker cluster with "fence_aws" fence agent
On 2020-09-04 5:15 p.m., Philippe M Stedman wrote: > Hi ClusterLabs development, > > I am in the process of deploying a two-node cluster on AWS and using the > fence_aws fence agent for fencing. I was reading through the following > article about common pitfalls in configuring two-node Pacemaker clusters: > https://www.thegeekdiary.com/most-common-two-node-pacemaker-cluster-issues-and-their-workarounds/ > > and the only concern I have is regarding the fencing device. If I read > this correctly, there is no need to configure delayed fencing if the > fence device can guarantee serialized access. My question here is does > the fence_aws agent guarantee serialized access? In the event of a loss > of communication between the two cluster nodes, can I guarantee that one > host will win the race to fence the other and I won't end up in a > situation where both hosts get fenced. > > Do I need to implement delayed fencing with the fence_aws agent or not? > I appreciate any feedback. > > Thanks, > > *Phil Stedman* > Db2 High Availability Development and Support > Email: pmste...@us.ibm.com It would depend on AWS, and I don't believe it's a good idea to design a solution that depends on a third party's behaviour. There's another aspect of fence delays to consider as well; it's also to help ensure that the best node survives, not just that one of them does. So say your DB is running on node 1; you want to preferentially fence node 2. If, later, your DB moves to node 2, then you want to reconfigure your stonith devices to preferentially fence node 1. The delay parameter tells the agent to wait N seconds before fencing the associated node. So if your DB is on node 1, you would set the stonith device configuration that terminates node 1 to have, say, 'delay="15"'. This way, node 2 looks up how to fence node 1, sees the delay, and sleeps. Node 1 looks up how to fence node 2, sees no delay, and fences immediately. 
Node 2 is dead before the sleep exits, ensuring that, in a comms break where both nodes are otherwise healthy, node 1, the service host, survives. -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/
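To make the delay concrete, a sketch of the two stonith devices with pcs. Device names, node names, region, and credentials are all illustrative placeholders (not from this thread), and fence_aws's required options are abbreviated:

```shell
# Sketch only: names, region and credentials are placeholders.
# The device that fences node1 (the DB host) carries the delay,
# so node1 wins a fence race after a comms break:
pcs stonith create fence-node1 fence_aws pcmk_host_list=node1 \
    region=us-east-1 access_key=... secret_key=... delay=15
pcs stonith create fence-node2 fence_aws pcmk_host_list=node2 \
    region=us-east-1 access_key=... secret_key=...
```

If the DB later moves to node2, the delay would be moved to fence-node2 so the new service host is the one protected.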
[ClusterLabs] Format of '--lifetime' in 'pcs resource move'
Hi all, Reading the pcs man page for the 'move' action, it talks about a '--lifetime' switch that appears to control when the location constraint is removed; move <resource id> [destination node] [--master] [lifetime=<lifetime>] [--wait[=n]] Move the resource off the node it is currently running on by creating a -INFINITY location constraint to ban the node. If destination node is specified the resource will be moved to that node by creating an INFINITY location constraint to prefer the destination node. If --master is used the scope of the command is limited to the master role and you must use the promotable clone id (instead of the resource id). If lifetime is specified then the constraint will expire after that time, otherwise it defaults to infinity and the constraint can be cleared manually with 'pcs resource clear' or 'pcs constraint delete'. If --wait is specified, pcs will wait up to 'n' seconds for the resource to move and then return 0 on success or 1 on error. If 'n' is not specified it defaults to 60 minutes. If you want the resource to preferably avoid running on some nodes but be able to failover to them use 'pcs constraint location avoids'. I think I want to use this, as we move resources manually for various reasons where the old host is still able to host the resource should a node failure occur. So we'd love to immediately remove the location constraint as soon as the move completes. I tried using '--lifetime=60' as a test, assuming the format was seconds, but that was invalid. How is this switch meant to be used? Cheers -- Digimer
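For what it's worth, the pcs builds I've looked at appear to expect an ISO 8601 duration here rather than a bare number of seconds; a hedged sketch (that format assumption, plus the resource and node names, are mine):

```shell
# Assumption: lifetime takes an ISO 8601 duration (PT60S, PT10M, P1D, ...),
# not bare seconds. Resource and node names are illustrative.
pcs resource move my_resource node2 lifetime=PT60S   # expire after 60 seconds
pcs resource move my_resource node2 lifetime=PT10M   # ...or after 10 minutes
```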
Re: [ClusterLabs] Beginner Question about VirtualDomain
On 2020-08-17 8:40 a.m., Sameer Dhiman wrote: > Hi, > > I am a beginner using pacemaker and corosync. I am trying to set up > a cluster of HA KVM guests as described by the Alteeve wiki (CentOS-6) but > in CentOS-8.2. My R&D setup is described below > > Physical Host running CentOS-8.2 with Nested Virtualization > 2 x CentOS-8.2 guest machines as Cluster Node 1 and 2. > WinXP as a HA guest. > > 1. drbd --> dlm --> lvmlockd --> LVM-activate --> gfs2 (guest machine > definitions) > 2. drbd --> dlm --> lvmlockd --> LVM-activate --> raw-lv (guest machine HDD) > > Question(s): > 1. How to prevent guest startup until gfs2 and raw-lv are available? In > CentOS-6 Alteeve used autostart=0 in the tag. Is there any similar > option in pacemaker because I did not find it in the documentation? > > 2. Suppose, if I configure constraint order gfs2 and raw-lv then guest > machine, stopping the guest machine would also stop the complete service > tree, so how can I prevent this? > > -- > Sameer Dhiman Hi Sameer, I'm the author of that wiki. It's quite out of date, as you noted, and we're actively developing a new release for EL8. Though, it won't be ready until near the end of the year. There are a few changes we've made that you might want to consider; 1. We never were too happy with DLM, and so we've reworked things to no longer need it. So we use normal LVM backing DRBD resources. One resource per VM, one volume per virtual disk backed by an LV. Our tools will automate this, but you can easily enough create them manually if your environment is fairly stable. 2. To get around GFS2, we create a /mnt/shared/{provision,definitions,files,archive} directory (note /shared -> /mnt/shared to be more LFS friendly). We'll again automate management of files in Striker, but you can copy the files manually and rsync out changes as needed (again, if your environment doesn't change much). 3. We changed DRBD from v8.4 to 9.0, and this meant a few things had to change.
We will integrate support for short-throw DR hosts (async "third node" in DRBD that is outside pacemaker). We run the resources to only allow a single primary normally and enable auto-promote. For live migration, we temporarily enable dual-primary, promote the target, migrate, demote the old host and disable dual-primary. This makes it safer as it's far less likely that someone could accidentally start a VM on the passive node (not that it ever happened as our tools prevented it, but it was _possible_, so we wanted to improve that). To handle #3, we've written our own custom RA (ocf:alteeve:server [1]). This RA is smart enough to watch/wait for things to become ready before starting. It also handles the DRBD stuff I mentioned, and the virsh call to do the migration. So it means the pacemaker config is extremely simple. Note though that it depends on the rest of our tools, so it won't work outside the Anvil!. That said, if you wanted to use it before we release Anvil! M3, you could probably adapt it easily enough. If you have any questions, please let me know and I'll help as best I can. Cheers, digimer (Note: during development, this code base is kept outside of Clusterlabs. We'll move it in when it reaches beta). 1. https://github.com/digimer/anvil/blob/master/ocf/alteeve/server -- Digimer
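As a side note on Sameer's first question, plain ordering constraints give the start-up gating he's after without requiring our custom RA; a sketch with pcs, where all resource names are invented for illustration:

```shell
# Sketch, assuming a cloned gfs2 mount 'gfs2-clone', an activated LV
# resource 'raw-lv', and a VirtualDomain resource 'winxp' (names invented).
pcs constraint order start gfs2-clone then start winxp
pcs constraint order start raw-lv then start winxp
pcs constraint colocation add winxp with gfs2-clone INFINITY
```

With the ordering in this direction, stopping the guest on its own leaves the storage stack running; it's only stopping the storage that forces the guest to stop first.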
Re: [ClusterLabs] Warning; EL6 kernel 2.6.32-754.28.1 breaks bonding!
On 2020-06-23 5:59 p.m., Hayden,Robert wrote: > >> -Original Message- >> From: Users On Behalf Of Digimer >> Sent: Monday, April 27, 2020 5:12 PM >> To: Cluster Labs - Users >> Subject: [ClusterLabs] Warning; EL6 kernel 2.6.32-754.28.1 breaks bonding! >> >> I've confirmed that active-passive bonding is very broken on the latest >> EL6 kernel, 2.6.32-754.28.1. It works fine on 2.6.32-754.27.1 though. >> >> I've opened up an RHBZ (#1828604), but it's auto-set to private. Copy of >> the bug is in CentOS #17292 >> (https://bugs.centos.org/view.php?id=17292) >> >> In short, if you use bonding and have EL6 nodes, do not upgrade to .28, >> and if you are already on .28, downgrade to .27. > > Following up. For all of us with clusters still using Linux 6.10 with > active/backup bonding, the wait is over with the release of the .30.2 kernel. > > For Red Hat Enterprise Linux systems, see RHSA-2020:2430 > For Oracle Linux systems, see ELSA-2020-2430 > > We have tested this on HPE hardware with network switch port disablement. > Looks like the correction works. > > Robert I had tested it successfully before the bug was closed, and tested the CentOS variant kernel and, as expected, it works fine as well. digimer >> Howto: >> >> # yum install kernel-2.6.32-754.27.1.el6.x86_64 >> kernel-devel-2.6.32-754.27.1.el6.x86_64 >> kernel-headers-2.6.32-754.27.1.el6.x86_64 >> >> Edit '/boot/grub/grub.conf' (or '/boot/efi/EFI/redhat/grub.conf') and >> change 'default=X' to the .27.1 kernel entry, reboot.
>> After reboot, remove .28.1;

-- Digimer
[ClusterLabs] Warning; EL6 kernel 2.6.32-754.28.1 breaks bonding!
I've confirmed that active-passive bonding is very broken on the latest EL6 kernel, 2.6.32-754.28.1. It works fine on 2.6.32-754.27.1 though. I've opened up an RHBZ (#1828604), but it's auto-set to private. Copy of the bug is in CentOS #17292 (https://bugs.centos.org/view.php?id=17292) In short, if you use bonding and have EL6 nodes, do not upgrade to .28, and if you are already on .28, downgrade to .27. Howto: # yum install kernel-2.6.32-754.27.1.el6.x86_64 kernel-devel-2.6.32-754.27.1.el6.x86_64 kernel-headers-2.6.32-754.27.1.el6.x86_64 Edit '/boot/grub/grub.conf' (or '/boot/efi/EFI/redhat/grub.conf') and change 'default=X' to the .27.1 kernel entry, reboot. After reboot, remove .28.1; -- Digimer
[ClusterLabs] New APC AP7900 switch PDU and fencing
Hi all, We've been using the APC AP7900 switched PDUs as backup fence devices forever and a day. The latest units have a new firmware that disables web and SNMP access by default. This breaks fence_apc_snmp, of course. We figured out how to configure it for use as a fence device. Here are the instructions, in case it helps others later; https://www.alteeve.com/w/Configuring_an_APC_AP7900 In short (run in the serial terminal); # Set the IP tcpip -i 10.20.2.1 -s 255.255.0.0 -g 10.20.255.254 # Enable the web (http and https) interfaces; web -h enable web -s enable # Enable SNMP snmp -S enable -c1 private -a1 writeplus snmp -S enable -c2 public -a2 writeplus NOTE: This config assumes an isolated network. You will want to tune the configuration if you are worried about access (ie: set snmp user/password, change 'public' to 'read', etc). I leave that as an exercise for the reader. -- Digimer
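Once the PDU is configured, a quick sanity check that it answers as a fence device. The IP and community match the serial-console settings above; the plug number is illustrative:

```shell
# Query outlet 1's power state over SNMP, using the standard
# fence-agent options (plug number is an example).
fence_apc_snmp --ip=10.20.2.1 --community=private --plug=1 --action=status
```

If this reports the outlet status, pacemaker's fence_apc_snmp stonith device should work against the same parameters.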
[ClusterLabs] Anvil! M2 v2.0.8 released
Alteeve is proud to announce the latest (and likely final) Anvil! M2 release; Version 2.0.8. https://github.com/ClusterLabs/striker/releases/tag/v2.0.8 This is a major release over v2.0.7; all users are strongly urged to upgrade. Main Feature Upgrades; * Major overhaul to striker-installer. It can now run against a standard minimal OS install. Stage-1 install from ISO/USB is no longer required (though still supported). This allows pure UEFI systems to be used for Striker dashboards. * Added a new scan-hardware that tracks RAM DIMMs and CSS LEDs state. * Updated the DB archive trigger values to improve DB performance. * Modified scan-storcli to no longer set a health score against a node until/unless a drive's media or other error counts exceed 5. * Added support for Windows 2019 and improved handling of Win10 and Win2016 guests. * Created 'anvil-rehome-server' to handle migrating hosted servers between Anvil! node pairs. * Created anvil-report-usage and created a stand-alone variant to show at the command line what resources each server uses and what resources are still available. Main Bugs Fixed * When archiving databases, it's possible that all entries for a given UUID will be purged from the history schema, leaving a record in public without its history pair. This broke scan agents and ScanCore itself when trying to resync. This has been fixed. * Added a new 'anvil-node-suicide' that will terminate a node that begins a shutdown and hangs (ie: because of a DLM hang). * Added a new 'fence_delay' fence agent that always fails when asked to fence. This agent ensures that fence_ipmilan will not be called before the BMC has time to reboot in a case where the PDUs killed the power to the node but the agent reported a failure. * Fixed a bug in scan-storcli where disks with EID:SID of 0 were being missed. * Fixed numerous small bugs in all sections of the Anvil!. This is expected to be the final M2 release.
All active development now switches to M3 and only critical bugs will be fixed going forward. See you all on EL8! NOTE: M3 development is happening outside the Clusterlabs repo. It will be moved over once it reaches beta. For those wishing to follow it during alpha, the repo is: https://github.com/digimer/anvil -- Digimer
Re: [ClusterLabs] NFS in different subnets
On 2020-04-18 2:48 a.m., Strahil Nikolov wrote: > On April 18, 2020 8:43:51 AM GMT+03:00, Digimer wrote: >> For what it's worth; A lot of HA specialists spent a lot of time trying >> to find the simplest _reliable_ way to do multi-site/geo-replicated HA. >> I am certain you'll find a simpler solution, but I would also wager >> that >> when it counts, it's going to let you down. >> >> The only way to make things simpler is to start making assumptions, and >> if you do that, at some point you will end up with a split-brain (both >> sites thinking the other is gone and trying to take the primary role) >> or >> both sites will think the other is running, and neither will be. Add >> shared storage to the mix, and there's a high chance you will corrupt >> data when you need it most. >> >> Of course, there's always a chance you'll come up with a system no one >> else has thought of, just be aware of what you know and what you don't. >> HA is fun, in big part, because it's a challenge to get right. >> >> digimer >> > > I don't get something. > > Why this cannot be done? > > One node is in siteA, one in siteB , qnet on third location.Routing between > the 2 subnets is established and symmetrical. > Fencing via IPMI or SBD (for example from a HA iSCSI cluster) is configured > > The NFS resource is started on 1 node and a special RA is used for the DNS > records. If node1 dies, the cluster will fence it and node2 will power up > the NFS and update the records. > > Of course, updating DNS only from 1 side must work for both sites. > > Best Regards, > Strahil Nikolov It comes down to differentiating between a link loss to a site versus the destruction/loss of the site. In either case, you can't fence the lost node, so what do you do? If you decide that you don't need to fence it, then you face all the issues of any other normal cluster with broken or missing fencing. 
It's just a question of time before you assume wrong and end up with a split brain / data divergence / data loss. Booth was designed the way it was precisely to solve this problem, by having "a cluster of clusters". If a site is lost because of a comms break, you can trust the cluster at the site to act in a predictable way. This is only possible because that site is a self-contained HA cluster, so it can be confidently assumed that it will shut down services when it loses contact with the peer and quorum sites. The only safe way to operate without this setup over a stretch cluster is to accept that a comms loss or site loss hangs the cluster until a human intervenes, but then, that's not really HA anymore. -- Digimer
Re: [ClusterLabs] NFS in different subnets
For what it's worth; A lot of HA specialists spent a lot of time trying to find the simplest _reliable_ way to do multi-site/geo-replicated HA. I am certain you'll find a simpler solution, but I would also wager that when it counts, it's going to let you down. The only way to make things simpler is to start making assumptions, and if you do that, at some point you will end up with a split-brain (both sites thinking the other is gone and trying to take the primary role) or both sites will think the other is running, and neither will be. Add shared storage to the mix, and there's a high chance you will corrupt data when you need it most. Of course, there's always a chance you'll come up with a system no one else has thought of, just be aware of what you know and what you don't. HA is fun, in big part, because it's a challenge to get right. digimer On 2020-04-17 4:43 p.m., Daniel Smith wrote: > We only have 1 cluster per site so adding additional hardware is not > optimal. I feel like I'm trying to use a saw where an axe would be the > proper tool. I thank you for your time, but it appears that it may be > best for me to write something from scratch for the monitoring and > controlling of the failover rather than try and force pacemaker to do > something it was not built for. > > **Daniel Smith > Network Engineer > **15894 Diplomatic Plaza Dr | Houston, TX 77032 > P: 281-233-8487 | M: 832-301-1087 > daniel.sm...@craneww.com <mailto:daniel.sm...@craneww.com> > <https://craneww.com/> > > > -Original Message- > From: Digimer [mailto:li...@alteeve.ca] > Sent: Friday, April 17, 2020 2:38 PM > To: Daniel Smith ; Cluster Labs - Users > > Subject: Re: NFS in different subnets > > EXTERNAL SENDER: Use caution with links/attachments. > > > On 2020-04-17 3:20 p.m., Daniel Smith wrote: >> Thank you digimer, and I apologize for getting the wrong email. >> >> >> >> Booth was the piece I was missing. Have been researching setting that >> up and finding a third location for quorum.
From what I have found, I >> believe I will need to setup single node pacemaker clusters at each > > No, each site needs to be a proper cluster (2 nodes minimum). The idea > is that, if the link to the building is lost, the cluster at the lost > site will shut down. With only one node, a hung node (that might recover > later) could recover and think it could still do things before it > realized it shouldn't. Booth is "a cluster of clusters". > > The nodes at each site should be on different hardware, for the same > reason. It is very much NOT a waste of resources (and, of course, use > proper, tested STONITH/fencing). > >> datacenter to use with booth. Since we have ESX clusters at each site >> which has its own redundancies built in, building redundant nodes at >> each site is pretty much a waste of resources imho. I have 2 questions >> about this setup though: >> >> 1. If I setup pacemaker with a single node and no virtual IP, are >> there any problems I need to be aware of? > > Yes, see above. > >> 2. Is drbd the best tool for the data sync between the sites? I've >> looked at drbd proxy, but I get the sense that it's not open source, >> or would rsync with incrond be a better option? > > DRBD would work, but you have to make a choice; If you run synchronous > so that data is never lost (writes are confirmed when they hit both > sites), then your disk latency/bandwidth is your network > latency/bandwidth. Otherwise, you run asynchronous but you'll lose any > data that didn't get transmitted before a site is lost. > > As for proxy; Yes, it's a commercial add-on. If protocol A (async) > replication can't buffer the data to be transmitted (because the data is > changing faster than it can be flushed out), DRBD proxy provides a > system to have a MUCH larger send cache. It's specifically designed for > long-throw asynchronous replication.
> >> I already made a script that executes with the network startup that >> updates DNS using nsupdate so that should be easy to create a resource >> based on it I would think. > > Yes, RAs are fairly simple to write. See: > > https://github.com/ClusterLabs/OCF-spec/blob/master/ra/1.0/resource-agent-api.md > > digimer
Re: [ClusterLabs] NFS in different subnets
On 2020-04-17 3:20 p.m., Daniel Smith wrote: > Thank you digimer, and I apologize for getting the wrong email. > > > > Booth was the piece I was missing. Have been researching setting that > up and finding a third location for quorum. From what I have found, I > believe I will need to setup single node pacemaker clusters at each No, each site needs to be a proper cluster (2 nodes minimum). The idea is that, if the link to the building is lost, the cluster at the lost site will shut down. With only one node, a hung node (that might recover later) could recover and think it could still do things before it realized it shouldn't. Booth is "a cluster of clusters". The nodes at each site should be on different hardware, for the same reason. It is very much NOT a waste of resources (and, of course, use proper, tested STONITH/fencing). > datacenter to use with booth. Since we have ESX clusters at each site > which has its own redundancies built in, building redundant nodes at > each site is pretty much a waste of resources imho. I have 2 questions > about this setup though: > > 1. If I setup pacemaker with a single node and no virtual IP, are > there any problems I need to be aware of? Yes, see above. > 2. Is drbd the best tool for the data sync between the sites? I’ve > looked at drbd proxy, but I get the sense that it’s not open source, or > would rsync with incrond be a better option? DRBD would work, but you have to make a choice; If you run synchronous so that data is never lost (writes are confirmed when they hit both sites), then your disk latency/bandwidth is your network latency/bandwidth. Otherwise, you run asynchronous but you'll lose any data that didn't get transmitted before a site is lost. As for proxy; Yes, it's a commercial add-on. If protocol A (async) replication can't buffer the data to be transmitted (because the data is changing faster than it can be flushed out), DRBD proxy provides a system to have a MUCH larger send cache.
It's specifically designed for long-throw asynchronous replication. > I already made a script that executes with the network startup that > updates DNS using nsupdate so that should be easy to create a resource > based on it I would think. Yes, RAs are fairly simple to write. See: https://github.com/ClusterLabs/OCF-spec/blob/master/ra/1.0/resource-agent-api.md digimer -- Digimer
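A minimal sketch of what such a DNS-updating agent could look like, following the OCF spec linked above. The record, IP, and all details here are invented for illustration; a real agent would take these as OCF_RESKEY_* parameters and emit full meta-data:

```shell
#!/bin/sh
# Illustrative OCF-style agent that points a DNS record at the active
# node via nsupdate. A sketch, not production code.
: "${OCF_ROOT:=/usr/lib/ocf}"
. "${OCF_ROOT}/lib/heartbeat/ocf-shellfuncs"

RECORD="nfs01.domain.local."   # invented; would normally be agent parameters
MY_IP="10.0.1.10"

start() {
    # Replace the A record with this node's IP.
    printf 'update delete %s A\nupdate add %s 60 A %s\nsend\n' \
        "$RECORD" "$RECORD" "$MY_IP" | nsupdate || exit $OCF_ERR_GENERIC
    exit $OCF_SUCCESS
}

monitor() {
    # "Running" if the record currently resolves to this node's IP.
    [ "$(dig +short "$RECORD")" = "$MY_IP" ] && exit $OCF_SUCCESS
    exit $OCF_NOT_RUNNING
}

case "$1" in
    start)     start ;;
    stop)      exit $OCF_SUCCESS ;;   # nothing to tear down locally
    monitor)   monitor ;;
    meta-data) echo "<!-- meta-data XML per the OCF spec goes here -->"
               exit $OCF_SUCCESS ;;
    *)         exit $OCF_ERR_UNIMPLEMENTED ;;
esac
```

The stop action is a no-op here because the surviving node's start will overwrite the record; whether that is acceptable depends on your failover semantics.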
Re: [ClusterLabs] NFS in different subnets
Hi Daniel, You sent this to the clusterlabs owners, instead of users. I've changed the CC to send it to the list for a larger discussion. The biggest problem with stretch clustering is knowing the difference between a link fault and a site loss. Pacemaker Booth was designed to solve this problem by using a "cluster of clusters". The logic being that if a site is lost, an arbiter node at a third site can decide which site should live, and trust the lost side will either behave sensibly (because it is itself a cluster), or it's destroyed. After this, it just becomes a question of implementation details. Having the master side update a DNS entry should be fine (though you may need to write a small resource agent to do it; not sure if one exists for DNS yet). digimer On 2020-04-17 11:44 a.m., Daniel Smith wrote: I have been searching how to customize pacemaker to manage NFS servers in separate datacenters, but I am finding older data that suggests this is a bad idea and not much information about how to customize it to do this without the 1 IP being moved back and forth. If this isn’t the best tool, please let me know, but here is the setup I am trying to do if someone can help point me to some information on how the best way is to do this. Server 1: DC01-NFS01, 10.0.1.10/24 Server 2: DC02-NFS01, 10.0.2.10/24 NFS share: nfs01.domain.local:/opt/nfsmounts using drbd to sync between datacenters DC01 to DC02 has a 2Gb layer 2 connection between the datacenters I would like to have pacemaker manage the NFS services on both systems in an active/passive setup where it updates the DNS servers with the active server’s IP for nfs01.domain.local. Eventually, we will have a virtual switch in VMWare that I would like pacemaker to update, but for now, the delay in DNS updates will be acceptable for failover. Thank you in advance for any help you can provide.
Daniel Smith Network Engineer 15894 Diplomatic Plaza Dr | Houston, TX 77032 P: 281-233-8487 | M: 832-301-1087 daniel.sm...@craneww.com -- Digimer
Re: [ClusterLabs] Solidarity during these extraordinary times
On 2020-03-18 1:17 p.m., Ken Gaillot wrote: > Hi everybody, > > I know that most if not all of the people on this list are working > under extreme pressures right now. > > I wanted to acknowledge that and say we're all in this together, and > that what we are doing is more important than ever. As online platforms > of all kinds experience a sudden surge in usage, providing stable > service is important for many reasons. > > High availability can't add capacity where none exists, but it is a > crucial link in the chain. I am proud of all of you for continuing to > work as best you can under the circumstances. No matter how small your > part may feel at the time, it is important. > > Open source has always been about more than creating and using > software. It is about community, and how we can accomplish more > together. > > Wishing you and your loved ones the best, > <3 -- Digimer
Re: [ClusterLabs] A note for digimer re: qdevice documentation
On 2020-02-05 5:07 a.m., Steven Levine wrote: > I'm having some trouble re-registering for the Clusterlabs IRC channel > but this might get to you. > > Red Hat's overview documentation of qdevice (quorum device when spelled > out in the doc) is here: > > https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html-single/configuring_and_managing_high_availability_clusters/index#assembly_configuring-quorum-devices-configuring-cluster-quorum > > > Steven Thanks! -- Digimer
Re: [ClusterLabs] 2020 Summit is right around the corner!
I pasted my password into fpaste/IRC a month ago. So... ya. :) digimer On 2019-12-02 2:24 p.m., Steven Levine wrote: > I did *not* mean this message to go to the whole list. My profuse apologies. > > I haven't made this rookie mistake in a decade or more. > > Sorry, > > Steven > > - Original Message - > From: "Steven Levine" > To: "Cluster Labs - All topics related to open-source clustering welcomed" > > Sent: Monday, December 2, 2019 1:23:44 PM > Subject: Re: [ClusterLabs] 2020 Summit is right around the corner! > > Ken: > > I plan to be there. I wasn't sure if I needed to let you know specifically. > > I got budget approval last fall when we thought this would happen in > November, but my department asked me to withdraw that request and re-apply > this quarter when the date was changed. And then they told me we had no more > money in the travel budget, so my manager has asked Chris if the budget could > come from his department. > > But I plan to come anyway, even if I have to pay my own way again. It's much > too important to my ability to do my job well for me to miss this if I can > possibly attend. > > Steven > > - Original Message - > From: "Ken Gaillot" > To: users@clusterlabs.org > Sent: Monday, December 2, 2019 12:43:01 PM > Subject: [ClusterLabs] 2020 Summit is right around the corner! > > The 2020 ClusterLabs summit is only two months away! Details are > available at: > > http://plan.alteeve.ca/index.php/HA_Cluster_Summit_2020 > > So far we have responses from Alteeve, Canonical, IBM MQ, NTT, Proxmox, > Red Hat, and SUSE. If anyone else thinks they might attend, please > reply here or email me privately so we can firm up the head count and > finalize planning. > > Thanks, > -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." 
- Stephen Jay Gould

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] Final Pacemaker 2.0.3 release now available
On 2019-11-27 7:27 p.m., Ken Gaillot wrote:
> On Mon, 2019-11-25 at 23:02 -0500, Digimer wrote:
>> Congrats!
>>
>> Can I ask, when might fencing become required? Is that still in the
>> works, or has it been shelved?
>>
>> digimer
>
> tl;dr shelved
>
> The original plan for 2.0.0 was to get rid of the stonith-enabled flag,
> but still allow disabling stonith via "requires=quorum" in
> rsc_defaults.
>
> Certain resources, such as stonith devices themselves or simple nagios
> checks, can be immediately started elsewhere even if their original
> node needs to be fenced. requires=quorum was designed for such
> resources. Setting that for all resources would make fencing largely
> irrelevant.
>
> That was shelved when I realized the code would have to be considerably
> more complicated to go that route. Also, someone could theoretically
> want "requires=quorum" for all resources while still wanting nodes to
> be fenced if they are lost.

Thanks for the update.

--
Digimer
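For readers unfamiliar with the setting Ken describes: `requires` can be set
cluster-wide through resource defaults, or per resource as a meta-attribute.
A minimal sketch in crmsh syntax (the `fence_dummy` agent and the resource
name `st-example` are illustrative, not from this thread):

```shell
# Cluster-wide default: a resource whose node is lost may be recovered as
# soon as the surviving partition has quorum, without waiting for fencing.
# Use with care -- this weakens fencing guarantees for all resources.
crm configure rsc_defaults requires=quorum

# Per-resource instead, e.g. for a stonith device or a simple check agent
# that is safe to restart elsewhere immediately (names are hypothetical):
crm configure primitive st-example stonith:fence_dummy \
    meta requires=quorum
```

This is a configuration sketch only; it assumes crmsh and is not something to
apply blindly on a production cluster.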
Re: [ClusterLabs] Final Pacemaker 2.0.3 release now available
Congrats!

Can I ask, when might fencing become required? Is that still in the
works, or has it been shelved?

digimer

On 2019-11-25 9:32 p.m., Ken Gaillot wrote:
> Hi all,
>
> The final release of Pacemaker version 2.0.3 is now available at:
>
> https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.0.3
>
> Highlights include:
>
> * A dynamic cluster recheck interval (you don't have to care about
> changing cluster-recheck-interval when using failure-timeout or most
> rules)
>
> * Pacemaker Remote options for security hardening (listen address and
> TLS priorities)
>
> * crm_mon supports the --output-as/--output-to options, has some tweaks
> to text and HTML output that will hopefully make it easier to read, has
> a correct count of disabled and blocked resources, and supports an
> option to set a stylesheet for HTML output
>
> * A new fence-reaction cluster option controls whether the local node
> stops pacemaker or panics the local host when notified of its own
> fencing (which can happen with fabric fencing agents such as
> fence_scsi)
>
> * Documentation improvements include a new chapter about ACLs
> (replacing an outdated text file) in "Pacemaker Explained" and another
> one about the command-line tools in "Pacemaker Administration":
>
> https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Explained/index.html#idm47160746093920
>
> https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Administration/index.html#idm47051359032720
>
> As usual, there were bug fixes and log message improvements as well.
> Most significantly, a regression introduced in 2.0.2 that effectively
> disabled concurrent-fencing has been fixed, and an invalid transition
> (blocking all further resource actions) has been fixed when both a
> guest node or bundle and the host running it need to be fenced, but
> can't (due to quorum loss, for example).
>
> For more details about changes in this release, see:
>
> https://github.com/ClusterLabs/pacemaker/blob/2.0/ChangeLog
>
> Many thanks to all contributors of source code to this release,
> including Aleksei Burlakov, Chris Lumens, Gao,Yan, Hideo Yamauchi, Jan
> Pokorný, John Eckersberg, Kazunori INOUE, Ken Gaillot, Klaus Wenninger,
> Konstantin Kharlamov, Munenari, Roger Zhou, S. Schuberth, Tomas
> Jelinek, and Yuusuke Iida.
>
> Version 1.1.22, with selected backports from this release, will also be
> released soon.

--
Digimer
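As an aside for readers: the new crm_mon output options mentioned in the
release notes take a format and a destination. A sketch of typical
invocations (file paths are illustrative; verify exact option names against
`crm_mon --help-output` on your build, as these require a running cluster to
test):

```shell
# One-shot cluster status written as XML to a file instead of text to stdout
crm_mon --one-shot --output-as=xml --output-to=/tmp/cluster-status.xml

# HTML output; 2.0.3 added support for pointing the output at an external
# CSS stylesheet (flag name assumed from the 2.0.x help text)
crm_mon --one-shot --output-as=html --output-to=/var/www/html/status.html \
        --html-stylesheet=/path/to/style.css
```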
Re: [ClusterLabs] Announcing ClusterLabs Summit 2020
On 2019-11-11 8:21 a.m., Thomas Lamprecht wrote:
> On 11/5/19 3:07 AM, Ken Gaillot wrote:
>> Hi all,
>>
>> A reminder: We are still interested in ideas for talks, and rough
>> estimates of potential attendees. "Maybe" is perfectly fine at this
>> stage. It will let us negotiate hotel rates and firm up the location
>> details.
>
> Maybe we (Proxmox) could also come, Vienna isn't too far away, after
> all. If there's interest, I could do a small talk about our
> knet/corosync + multi-master clustered configuration filesystem + HA
> stack :)
>
> cheers,
> Thomas

Certainly! The more the merrier. :)

--
Digimer
Re: [ClusterLabs] Announcing ClusterLabs Summit 2020
On 2019-11-07 11:41 p.m., Keisuke MORI wrote:
> Hi,
>
> On Tue, Nov 5, 2019 at 11:08, Ken Gaillot wrote:
>>
>> Hi all,
>>
>> A reminder: We are still interested in ideas for talks, and rough
>> estimates of potential attendees. "Maybe" is perfectly fine at this
>> stage. It will let us negotiate hotel rates and firm up the location
>> details.
>
> I would like to join the Summit. Two from NTT will be there, myself and
> one more person.
>
> I don't have a specific topic to talk about right now, but possibly I can
> talk about PostgreSQL 12 support in the pgsql resource agent, or share
> our test results and the issues from a user's point of view.
>
> Looking forward to seeing you guys.
> Thanks,

Woohoo! Looking forward to seeing you again. :)

--
Digimer
Re: [ClusterLabs] Antw: Re: Announcing ClusterLabs Summit 2020
On 2019-11-07 4:23 a.m., Jehan-Guillaume de Rorthais wrote:
> Hi,
>
> On Wed, 06 Nov 2019 09:41:36 -0600
> Ken Gaillot wrote:
>
>> This topic sounds promising. Maybe we could do a round table where 3 or
>> 4 people give 15-minute presentations about their technique?
>>
>> Jehan-Guillaume, Damien, Ulrich, would you possibly be interested in
>> participating? I realize it's early to make any firm commitments, but
>> we could start considering the possibilities.
>
> I'm not sure I'll come to the summit, even though I would be glad to
> meet this community IRL. I'm not sure about my availability, or how I
> could contribute. It's a long road from France for a very limited
> contribution :/
>
> However, I am highly interested in this subject, so I hope I can
> participate in some other way.
>
> Moreover, as an RA maintainer, I still have to gather a list of itchy
> things I could share as well. We can then decide what is the best way
> to discuss them. Maybe the mailing list would be enough.

I can't speak to travel, but I can say that the more people attend, the more
complete the discussions will be. Being an attendee who listens and gives
feedback is, itself, well worth it. So if you're on the fence, come. The more
people who attend, the better for all.

--
Digimer
Re: [ClusterLabs] Announcing ClusterLabs Summit 2020
On 2019-11-05 12:21 p.m., Rafael David Tinoco wrote:
> On 04/11/2019 23:07, Ken Gaillot wrote:
>> Hi all,
>>
>> A reminder: We are still interested in ideas for talks, and rough
>> estimates of potential attendees. "Maybe" is perfectly fine at this
>> stage. It will let us negotiate hotel rates and firm up the location
>> details.
>
> Hello. This is Rafael, from Canonical. I'm currently in charge of the HA
> stack in Ubuntu (and helping debian-ha-maintainers), especially now for
> the 20.04 LTS release.
>
> I wonder if I could participate in the event and, possibly, have a slot
> to share our experience and the work being done, especially related to
> testing on other architectures, like arm64 and s390x.
>
> I also hope this opportunity can bring us closer to upstream, so we can
> start contributing more with patches and fixes.
>
> Thank you!

You're absolutely welcome! :)

--
Digimer
Re: [ClusterLabs] Announcing ClusterLabs Summit 2020
On 2019-11-05 10:09 a.m., Ken Gaillot wrote:
> On Tue, 2019-11-05 at 00:39 -0500, Digimer wrote:
>> On 2019-11-04 9:07 p.m., Ken Gaillot wrote:
>>> Hi all,
>>>
>>> A reminder: We are still interested in ideas for talks, and rough
>>> estimates of potential attendees. "Maybe" is perfectly fine at this
>>> stage. It will let us negotiate hotel rates and firm up the location
>>> details.
>>
>> I will be there. I would like to talk about Anvil! M3, our RHEL8 +
>> Pacemaker 2 + knet/corosync3 + DRBD 9 VM cluster stack. If time allows
>> for a second slot, I'd be interested in talking about ScanCore AI, an
>> artificial intelligence initiative Alteeve is embarking on with a local
>> university (which is likely 2~3 years away, so I can leave it for the
>> next summit in a couple years if we fill up the speaking slots).
>
> Awesome, sounds interesting! Let's plan on two slots, I don't think
> there will be a problem with that.

Thanks! If it gets tight though, consider my second slot removable to make
space for others.

--
Digimer
Re: [ClusterLabs] Announcing ClusterLabs Summit 2020
On 2019-11-04 9:07 p.m., Ken Gaillot wrote:
> Hi all,
>
> A reminder: We are still interested in ideas for talks, and rough
> estimates of potential attendees. "Maybe" is perfectly fine at this
> stage. It will let us negotiate hotel rates and firm up the location
> details.

I will be there. I would like to talk about Anvil! M3, our RHEL8 +
Pacemaker 2 + knet/corosync3 + DRBD 9 VM cluster stack. If time allows
for a second slot, I'd be interested in talking about ScanCore AI, an
artificial intelligence initiative Alteeve is embarking on with a local
university (which is likely 2~3 years away, so I can leave it for the
next summit in a couple years if we fill up the speaking slots).

--
Digimer
Re: [ClusterLabs] Antw: DLM, cLVM, GFS2 and OCFS2 managed by systemd instead of crm ?
On 2019-10-16 2:16 a.m., Ulrich Windl wrote:
>>>> "Lentes, Bernd" wrote on 15.10.2019 at 21:35 in message
>>>> <1922568650.3402980.1571168140600.javamail.zim...@helmholtz-muenchen.de>:
>> Hi,
>>
>> I'm a big fan of simple solutions (KISS).
>> Currently I have DLM, cLVM, GFS2 and OCFS2 managed by pacemaker.
>> They are all fundamental prerequisites for my resources (virtual domains).
>> To configure them I used clones and groups.
>> Why not have them managed by systemd to make the cluster setup easier
>> to oversee?
>>
>> Is there a strong reason that pacemaker cares about them?
>
> AFAIK, DLM (and maybe others too) needs the cluster infrastructure
> (communication layer) to be operable.
> Also, I consider systemd's handling of resources to be worse than
> pacemaker's.
> What is your specific problem? Keeping the cluster configuration simple
> while moving complexity to systemd?
>
> Do you know one command that describes your systemd configuration as
> concisely as the cluster configuration (like "crm configure show")?
>
> Regards,
> Ulrich

This is correct. DLM uses corosync.

--
Digimer
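To illustrate why DLM stays under the cluster's control rather than systemd's:
it is typically run as a cloned resource via the controld agent, so pacemaker
only starts it once corosync membership is established, and dependent
filesystems are ordered after it. A sketch in crmsh syntax (resource names,
and the `fs-gfs2-clone` resource in particular, are illustrative):

```shell
# DLM needs the corosync communication layer, so it is managed by
# pacemaker: a clone of the controld agent runs on every cluster node.
crm configure primitive dlm ocf:pacemaker:controld \
    op monitor interval=60s timeout=60s
crm configure clone dlm-clone dlm \
    meta interleave=true

# Cluster filesystems that depend on DLM are then ordered after the
# clone (fs-gfs2-clone is a hypothetical GFS2 filesystem clone):
crm configure order gfs2-after-dlm Mandatory: dlm-clone fs-gfs2-clone
```

A systemd unit could not express this dependency on cluster membership, which
is the point Ulrich and Digimer are making.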