Re: [ClusterLabs] HA-Cluster, UPS and power outage - how is your setup ?

2022-02-01 Thread Digimer
il/Tools/ScanCore.pm#L1541 This is super high level, and much of the specifics are related to the Anvil! cluster, but it hopefully gives you a starting point on how to approach the problem. We've been doing it this way for many years with really good effect. Cheers

Re: [ClusterLabs] Removing a resource without stopping it

2022-01-29 Thread Digimer
stuff can be bypassed, if the approach works. Best Regards, Strahil Nikolov On Sat, Jan 29, 2022 at 15:43, Digimer wrote:

Re: [ClusterLabs] Removing a resource without stopping it

2022-01-29 Thread Digimer
you just the same though! -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sw

Re: [ClusterLabs] Removing a resource without stopping it

2022-01-28 Thread Digimer
On 2022-01-29 00:10, Digimer wrote: On 2022-01-28 16:54, Ken Gaillot wrote: On Fri, 2022-01-28 at 16:38 -0500, Digimer wrote: Hi all, I'm trying to figure out how to move a running VM from one pace

Re: [ClusterLabs] Removing a resource without stopping it

2022-01-28 Thread Digimer
On 2022-01-28 16:54, Ken Gaillot wrote: On Fri, 2022-01-28 at 16:38 -0500, Digimer wrote: Hi all, I'm trying to figure out how to move a running VM from one pacemaker cluster to another. I've got the storage and VM live migration sorted,

[ClusterLabs] Removing a resource without stopping it

2022-01-28 Thread Digimer
d. So I am assuming it thought it couldn't stop the service so it self-fenced. In any case, can someone let me know what the proper procedure is? Said more directly; How do I delete a resource from pacemaker (via pcs on EL8) without stopping the resource? -- Digimer Papers and

Re: [ClusterLabs] Is there a DRBD forum?

2021-10-19 Thread Digimer
's list at drbd-u...@lists.linbit.com, and they maintain a Slack on "linbit-community". -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that pe

Re: [ClusterLabs] Pacemaker 2.1.1 final release now available

2021-09-10 Thread Digimer
Albrigtsen. Congrats to all!! -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops."

Re: [ClusterLabs] If You Were Building a DRBD-Based Cluster Today...

2021-08-18 Thread Digimer
something else entirely?   What will the use-case be? -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have live

Re: [ClusterLabs] Pacemaker/corosync behavior in case of partial split brain

2021-08-05 Thread Digimer
determined in your case above, I'll let one of the corosync people decide. -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived a

Re: [ClusterLabs] 'pcs stonith update' takes, then reverts

2021-07-26 Thread Digimer
On 2021-07-26 12:50 p.m., kgail...@redhat.com wrote: On Mon, 2021-07-26 at 12:25 -0400, Digimer wrote: On 2021-07-26 9:54 a.m., kgail...@redhat.com wrote: On Fri, 2021-07-23 at 21:46 -0400, Digimer wrote

Re: [ClusterLabs] 'pcs stonith update' takes, then reverts

2021-07-26 Thread Digimer
On 2021-07-26 9:54 a.m., kgail...@redhat.com wrote: On Fri, 2021-07-23 at 21:46 -0400, Digimer wrote: After a LOT of hassle, I finally got it updated, but OMG it was painful. I degraded the cluster (unsure if needed), set maintenance mode, deleted

Re: [ClusterLabs] 'pcs stonith update' takes, then reverts

2021-07-23 Thread Digimer
e back with the old configs. Certainly I was doing something wrong, but what? digimer On 2021-07-23 8:04 p.m., Digimer wrote: > Update; > >   Appears I can't even delete the damn things. They re-appeared after > doing a [pcs stonith remove '!. > >   Wow. > > d

Re: [ClusterLabs] 'pcs stonith update' takes, then reverts

2021-07-23 Thread Digimer
Update;   Appears I can't even delete the damn things. They re-appeared after doing a [pcs stonith remove '!.   Wow. digimer On 2021-07-23 7:56 p.m., Digimer wrote: > Hi all, > >   Got a really odd one here... > >   I had a cluster in the lab where it was built and t

[ClusterLabs] 'pcs stonith update' takes, then reverts

2021-07-23 Thread Digimer
ared before updating, still no luck. Any idea what I'm doing wrong? The logs from the node I run the update on, followed by the logs on the peer: digimer Jul 23 16:44:43 an-a02n01.alteeve.com pacemaker-attrd[121631]:  notice: Updating all attributes after cib_refresh_notify event Jul 23

Re: [ClusterLabs] pcs stonith update problems

2021-07-21 Thread Digimer
On 2021-07-21 8:19 a.m., Tomas Jelinek wrote: > On 16. 07. 21 at 16:30, Digimer wrote: >> On 2021-07-16 9:26 a.m., Tomas Jelinek wrote: >>> On 16. 07. 21 at 6:35, Andrei Borzenkov wrote: >>>> On 16.07.2021 01:02, Digimer wrote: >>>>> Hi all, >

Re: [ClusterLabs] Two node cluster without fencing and no split brain?

2021-07-21 Thread Digimer
a SBD device is even better. > > Regards, The third node with storage-based death is a way of creating a fence configuration. It works because it's fencing, not because it's quorum. -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in t

Re: [ClusterLabs] Two node cluster without fencing and no split brain?

2021-07-20 Thread Digimer
2-Node_Myth (note: currently throwing a cert error related to the let's encrypt issue, should be cleared up soon). -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that pe

Re: [ClusterLabs] pcs stonith update problems

2021-07-16 Thread Digimer
On 2021-07-16 10:48 a.m., kgail...@redhat.com wrote: > On Thu, 2021-07-15 at 18:02 -0400, Digimer wrote: >> Hi all, >> >> I've got a predicament... I want to update a stonith resource to >> remove an argument. Specifically, when resource move nodes, I want t

Re: [ClusterLabs] pcs stonith update problems

2021-07-16 Thread Digimer
On 2021-07-16 9:26 a.m., Tomas Jelinek wrote: > On 16. 07. 21 at 6:35, Andrei Borzenkov wrote: >> On 16.07.2021 01:02, Digimer wrote: >>> Hi all, >>> >>> I've got a predicament... I want to update a stonith resource to >>> remove an argument.

[ClusterLabs] pcs stonith update problems

2021-07-15 Thread Digimer
elay' value becomes '0'. So it seems that, if an argument previously existed and is NOT specified in an update, it is not removed. Is this intentional for some reason? If so, how would I remove the delay attribute? I've got a fairly complex stonith config, with stonith levels. Del

Re: [ClusterLabs] Pacemaker 2.1.0 final release now available

2021-06-08 Thread Digimer
ository, and the > following wiki page, which distribution packagers and users who build > Pacemaker from source or use Pacemaker command-line tools in scripts > are encouraged to go over carefully: > > https://wiki.clusterlabs.org/wiki/Pacemaker_2.1_Changes Huge congrats!! -- Digime

Re: [ClusterLabs] Cluster Stopped, No Messages?

2021-05-28 Thread Digimer
On 2021-05-28 3:08 p.m., Eric Robinson wrote: > >> -Original Message- >> From: Digimer >> Sent: Friday, May 28, 2021 12:43 PM >> To: Cluster Labs - All topics related to open-source clustering welcomed >> ; Eric Robinson ; Strahil >> Nikolov >>

Re: [ClusterLabs] Cluster Stopped, No Messages?

2021-05-28 Thread Digimer
Shared storage is not what triggers the need for fencing. Coordinating actions is what triggers the need. Specifically; If you can run a resource on both/all nodes at the same time, you don't need HA. If you can't, you need fencing. digimer On 2021-05-28 1:19 p.m., Eric Robinson wrote:

Re: [ClusterLabs] #clusterlabs IRC channel

2021-05-19 Thread Digimer
On 2021-05-19 1:10 p.m., Digimer wrote: > On 2021-05-19 12:58 p.m., Digimer wrote: >> On 2021-05-19 12:55 p.m., kgail...@redhat.com wrote: >>> Hello all, >>> >>> The ClusterLabs community has long used a #clusterlabs IRC channel on >>> the popular IRC

Re: [ClusterLabs] #clusterlabs IRC channel

2021-05-19 Thread Digimer
On 2021-05-19 12:58 p.m., Digimer wrote: > On 2021-05-19 12:55 p.m., kgail...@redhat.com wrote: >> Hello all, >> >> The ClusterLabs community has long used a #clusterlabs IRC channel on >> the popular IRC server freenode.net. >> >> As you may have heard,

Re: [ClusterLabs] #clusterlabs IRC channel

2021-05-19 Thread Digimer
orporate buy-out that was perceived as threatening > the user community's values. > > Many have moved to a new server, libera.chat, organized as a nonprofit. > We have grabbed the #clusterlabs channel there to reserve the name. > (Thanks, digimer!) > > Our options are to

Re: [ClusterLabs] 32 nodes pacemaker cluster setup issue

2021-05-18 Thread Digimer
On 2021-05-18 1:13 p.m., S Sathish S wrote: > Hi Digimer/Team, > >   > > In our product use unicast protocols and CPU load is normal while > problematic timing. > >   > > We don’t defined corosync / totem timing values using default timing > till now, Please s

Re: [ClusterLabs] 32 nodes pacemaker cluster setup issue

2021-05-18 Thread Digimer
ClusterLabs/pacemaker/tree/Pacemaker-2.0.2> > > corosync-2.4.4 -->  https://github.com/corosync/corosync/tree/v2.4.4 > <https://github.com/corosync/corosync/tree/v2.4.4> > > pcs-0.9.169 > >   > > Thanks and Regards, > > S Sathish S As I understand it,

Re: [ClusterLabs] Problem with the cluster becoming mostly unresponsive

2021-05-14 Thread Digimer
On 2021-05-14 6:06 p.m., kgail...@redhat.com wrote: > On Fri, 2021-05-14 at 15:04 -0400, Digimer wrote: >> Hi all, >> >> I've run into an issue a couple of times now, and I'm not really >> sure >> what's causing it. I've got a RHEL 8 clu

[ClusterLabs] Problem with the cluster becoming mostly unresponsive

2021-05-14 Thread Digimer
errors about fence_delay metadata, that will be fixed and I don't believe it's related. Any advice on what happened, how to avoid it, and how to clean up without a full cluster restart, should it happen again? -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, les

Re: [ClusterLabs] 2 node mariadb-cluster - constraint-problems ?

2021-05-11 Thread Digimer
(id:colocation-mysql-server-fs_database-INFINITY) > httpd_srv with mysql-server (score:INFINITY) > (id:colocation-httpd_srv-mysql-server-INFINITY) > Ticket Constraints: > > Alerts: > No alerts defined > > Resources Defaults: > No defaults set > Operations Defaults: > No defaults set > > Cl

Re: [ClusterLabs] fencing

2021-05-07 Thread Digimer
"reboot" ip="10.201.2.4" port="4" power_wait="5" op monitor interval="60" pcs stonith level add 2 an-a02n02 apc_snmp_node2_psu1,apc_snmp_node2_psu2 pcs property set stonith-max-attempts=INFINITY pcs property set stonith-enab

Re: [ClusterLabs] Stopping the last node with pcs

2021-04-28 Thread Digimer
On 2021-04-28 10:10 a.m., Ken Gaillot wrote: > On Tue, 2021-04-27 at 23:23 -0400, Digimer wrote: >> Hi all, >> >> I noticed something odd. >> >> >> [root@an-a02n01 ~]# pcs cluster status >> Cluster Status: >> Cluster Summary: >>

[ClusterLabs] Stopping the last node with pcs

2021-04-27 Thread Digimer
: Stopping the node will cause a loss of the quorum, use --force to override Shouldn't pcs know it's the last node and shut down without complaint? -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s br

Re: [ClusterLabs] Cyberpower PDU41001 fencing agent?

2021-04-25 Thread Digimer
e written a couple PDU-based fence agents. They seem to support SNMP and per-outlet switching, so I suspect supporting it would be fairly easy. You'll need to know some OIDs, and the community that allows write access, to set the outlet states. Can you use 'snmpwalk' to collect the

[ClusterLabs] Single-node automated startup question

2021-04-14 Thread Digimer
not aware of. So, A) is there a pacemaker version of post_join_delay? B) is there a compelling argument NOT to use post_join_delay behaviour in pacemaker I am not seeing? Thanks! -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and conv

Re: [ClusterLabs] Antw: Re: Antw: RE: Antw: [EXT] Re: "Error: unable to fence '001db02a'" but It got fenced anyway

2021-03-05 Thread Digimer
On 2021-03-05 12:26 p.m., Klaus Wenninger wrote: > On 3/5/21 6:04 PM, Digimer wrote: >> On 2021-03-05 2:14 a.m., Ulrich Windl wrote: >>>>> How would the fencing be confirmed? I don't know. >>>> It's part of the FenceAgentAPI. The cluster invokes the fe

Re: [ClusterLabs] [ClusterLabs Developers] fence-virt: consider to merge into fence-agents git repository

2021-03-05 Thread Digimer
eration. +1 to merge -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweats

Re: [ClusterLabs] Antw: Re: Antw: RE: Antw: [EXT] Re: "Error: unable to fence '001db02a'" but It got fenced anyway

2021-03-05 Thread Digimer
Us on different switches from the IPMI BMC connections. Fencing really is critical, and as such, it should be certain to work, and ideally, have a backup fence method. So if you find that your fence-azure agent isn't reliable, and you can use SBD as Klaus mentioned, you can configure fence-s

Re: [ClusterLabs] Antw: RE: Antw: [EXT] Re: "Error: unable to fence '001db02a'" but It got fenced anyway

2021-03-03 Thread Digimer
ble to fence >> '001db02a'" but It got fenced anyway >> >>>>> Eric Robinson wrote on 02.03.2021 at >>>>> 19:26 in >> message >> > 3.prod.outlook.com> >> >>>> -Original Message- >>>> F

Re: [ClusterLabs] Antw: RE: Antw: [EXT] Re: "Error: unable to fence '001db02a'" but It got fenced anyway

2021-03-03 Thread Digimer
On 2021-03-03 1:56 a.m., Ulrich Windl wrote: >>>> Eric Robinson wrote on 02.03.2021 at 19:26 in > message > > >>> -Original Message- >>> From: Users On Behalf Of Digimer >>> Sent: Monday, March 1, 2021 11:02 AM >>> To: Cl

Re: [ClusterLabs] Antw: [EXT] Re: "Error: unable to fence '001db02a'" but It got fenced anyway

2021-03-01 Thread Digimer
> pcmk_monitor_retries=4 pcmk_action_limit=3 >> op monitor interval=3600 >> >> > https://docs.microsoft.com/en‑us/azure/virtual‑machines/workloads/sap/high‑avai > >> lability‑guide‑rhel‑pacemaker >> >> ‑‑ >> Valentin >> _

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-02-26 Thread Digimer
On 2021-02-26 12:23 p.m., Eric Robinson wrote: >> -Original Message- >> From: Digimer >> Sent: Friday, February 26, 2021 10:35 AM >> To: Cluster Labs - All topics related to open-source clustering welcomed >> ; Eric Robinson >> Subject: Re: [Cl

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-02-26 Thread Digimer
e other node's logs say? What is the cluster configuration? Do you have stonith (fencing) configured? Quorum is a useful tool when things are working properly, but it doesn't help when things enter an undefined / unexpected state. When that happens, stonith saves you. So said anoth

Re: [ClusterLabs] Antw: [EXT] Re: Stop timeout=INFINITY not working

2021-01-27 Thread Digimer
On 2021-01-27 2:29 a.m., Ulrich Windl wrote: >>>> Ken Gaillot wrote on 26.01.2021 at 16:08 in > message > : >> On Tue, 2021-01-26 at 02:12 -0500, Digimer wrote: >>> Hi all, >>> >>> I created a resource with an INFINITE stop timeout; >>

Re: [ClusterLabs] Stopping all nodes causes servers to migrate

2021-01-26 Thread Digimer
On 2021-01-26 11:27 a.m., Ken Gaillot wrote: > On Tue, 2021-01-26 at 11:03 -0500, Digimer wrote: >> On 2021-01-26 10:15 a.m., Tomas Jelinek wrote: >>> On 25. 01. 21 at 17:01, Ken Gaillot wrote: >>>> On Mon, 2021-01-25 at 09:51 +0100, Jehan-Guillaume de Rorthai

Re: [ClusterLabs] Stopping all nodes causes servers to migrate

2021-01-26 Thread Digimer
On 2021-01-26 10:15 a.m., Tomas Jelinek wrote: > On 25. 01. 21 at 17:01, Ken Gaillot wrote: >> On Mon, 2021-01-25 at 09:51 +0100, Jehan-Guillaume de Rorthais wrote: >>> Hi Digimer, >>> >>> On Sun, 24 Jan 2021 15:31:22 -0500 >>> Digimer wrote: >

[ClusterLabs] Stop timeout=INFINITY not working

2021-01-25 Thread Digimer
will be shut down now.\n ] Did I not configure the stop timeout correctly? Thanks for any insight. -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty

Re: [ClusterLabs] Stopping all nodes causes servers to migrate

2021-01-25 Thread Digimer
On 2021-01-25 3:58 p.m., Ken Gaillot wrote: > On Mon, 2021-01-25 at 13:18 -0500, Digimer wrote: >> On 2021-01-25 11:01 a.m., Ken Gaillot wrote: >>> On Mon, 2021-01-25 at 09:51 +0100, Jehan-Guillaume de Rorthais >>> wrote: >>>> Hi Digimer, >>>> >

Re: [ClusterLabs] Stopping all nodes causes servers to migrate

2021-01-25 Thread Digimer
On 2021-01-25 11:01 a.m., Ken Gaillot wrote: > On Mon, 2021-01-25 at 09:51 +0100, Jehan-Guillaume de Rorthais wrote: >> Hi Digimer, >> >> On Sun, 24 Jan 2021 15:31:22 -0500 >> Digimer wrote: >> [...] >>> I had a test server (srv01-test) running on node

[ClusterLabs] Stopping all nodes causes servers to migrate

2021-01-24 Thread Digimer
pacemaker really did ask for a migration. Is this the case? If not, what environment variables should have been set in this scenario? Thanks for any insight! -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s br

Re: [ClusterLabs] Stopping a server failed and fenced, despite disabling stop timeout

2021-01-19 Thread Digimer
On 2021-01-19 4:57 a.m., Tomas Jelinek wrote: > On 18. 01. 21 at 20:08, Digimer wrote: >> On 2021-01-18 4:49 a.m., Tomas Jelinek wrote: >>> Hi Digimer, >>> >>> Regarding pcs behavior: >>> >>> When deleting a resource, pcs first sets

Re: [ClusterLabs] Antw: [EXT] Re: Stopping a server failed and fenced, despite disabling stop timeout

2021-01-18 Thread Digimer
On 2021-01-19 2:27 a.m., Ulrich Windl wrote: >>>> Digimer wrote on 18.01.2021 at 20:08 in message > <64c1aa75-a15a-95c3-6853-e21fc0dc8...@alteeve.ca>: >> On 2021-01-18 4:49 a.m., Tomas Jelinek wrote: >>> Hi Digimer, >>> >>> Regarding pcs

Re: [ClusterLabs] Stopping a server failed and fenced, despite disabling stop timeout

2021-01-18 Thread Digimer
On 2021-01-18 1:52 p.m., Ken Gaillot wrote: > On Sun, 2021-01-17 at 21:11 -0500, Digimer wrote: >> Hi all, >> >> Mind the slew of questions, well into testing now and finding lots >> of >> issues. This one is two questions... :) >> >> I set a server

Re: [ClusterLabs] Stopping a server failed and fenced, despite disabling stop timeout

2021-01-18 Thread Digimer
On 2021-01-18 4:49 a.m., Tomas Jelinek wrote: > Hi Digimer, > > Regarding pcs behavior: > > When deleting a resource, pcs first sets its target-role to Stopped, > pushes the change into pacemaker and waits for the resource to stop. > Once the resource stops, pcs removes the

Re: [ClusterLabs] Antw: Antw: [EXT] Stopping a server failed and fenced, despite disabling stop timeout

2021-01-18 Thread Digimer
On 2021-01-18 3:31 a.m., Ulrich Windl wrote: >>>> "Ulrich Windl" wrote on 18.01.2021 at > 09:28 in message <6005469702a10003e...@gwsmtp.uni-regensburg.de>: >>>>> Digimer wrote on 18.01.2021 at 03:11 in message >> <816a4

[ClusterLabs] Stopping a server failed and fenced, despite disabling stop timeout

2021-01-17 Thread Digimer
and that if a resource was unmanaged, that the resource wouldn't even try to stop (question 2). Can someone help me understand what happened here? digimer More below; [root@el8-a01n01 ~]# pcs resource remove srv01-test Attempting to stop: srv01-test... Warning: 'srv01-test'

[ClusterLabs] Completely disabled resource failure triggered fencing

2021-01-17 Thread Digimer
bersome and still, in testing, I'm finding cases where the node gets fenced when something breaks the resource in a creative way. Thanks for any insight/guidance! -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Ein

Re: [ClusterLabs] Best way to obtain timestamp when node was set to "standby"

2020-10-20 Thread Digimer
st a couple ideas, not sure how well they'd work in practice. -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived a

Re: [ClusterLabs] Maintenance mode status in CIB

2020-10-13 Thread Digimer
nce. > > Best Regards, > Strahil Nikolov Can you clarify what you mean by "power off the stack on all nodes"? Do you mean stop pacemaker/corosync/knet daemons themselves without issue? -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in t

Re: [ClusterLabs] Maintenance mode status in CIB

2020-10-13 Thread Digimer
On 2020-10-13 5:41 a.m., Jehan-Guillaume de Rorthais wrote: > On Tue, 13 Oct 2020 04:48:04 -0400 > Digimer wrote: > >> On 2020-10-13 4:32 a.m., Jehan-Guillaume de Rorthais wrote: >>> On Mon, 12 Oct 2020 19:08:39 -0400 >>> Digimer wrote: >>> >>>

Re: [ClusterLabs] Maintenance mode status in CIB

2020-10-13 Thread Digimer
On 2020-10-13 4:32 a.m., Jehan-Guillaume de Rorthais wrote: > On Mon, 12 Oct 2020 19:08:39 -0400 > Digimer wrote: > >> Hi all, > > Hi you, > >> >> I noticed that there appears to be a global "maintenance mode" >> attribute under cluster_p

[ClusterLabs] Maintenance mode status in CIB

2020-10-12 Thread Digimer
intenance --all' What is the difference between these attributes? Cheers -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein

Re: [ClusterLabs] Antw: [EXT] Avoiding self-fence on RA failure

2020-10-07 Thread Digimer
On 2020-10-07 2:35 a.m., Ulrich Windl wrote: >>>> Digimer wrote on 07.10.2020 at 05:42 in message > : >> Hi all, >> >> While developing our program (and not being a production cluster), I >> find that when I push broken code to a node, causing the RA

Re: [ClusterLabs] Avoiding self-fence on RA failure

2020-10-06 Thread Digimer
On 2020-10-07 2:20 a.m., Digimer wrote: > On 2020-10-07 1:49 a.m., Andrei Borzenkov wrote: >> 07.10.2020 06:42, Digimer wrote: >>> Hi all, >>> >>> While developing our program (and not being a production cluster), I >>> find that when I push bro

Re: [ClusterLabs] Avoiding self-fence on RA failure

2020-10-06 Thread Digimer
On 2020-10-07 1:49 a.m., Andrei Borzenkov wrote: > 07.10.2020 06:42, Digimer wrote: >> Hi all, >> >> While developing our program (and not being a production cluster), I >> find that when I push broken code to a node, causing the RA to fail to >> perform a

[ClusterLabs] Avoiding self-fence on RA failure

2020-10-06 Thread Digimer
imer.ca pacemaker-attrd[33817]: notice: Setting fail-count-srv07-el6#stop_0[mk-a02n01]: (unset) -> INFINITY Oct 06 23:33:54 mk-a02n01.digimer.ca pacemaker-attrd[33817]: notice: Setting last-failure-srv07-el6#stop_0[mk-a02n01]: (unset) -> 1602041634 Connection to mk-a02n01.ifn cl

Re: [ClusterLabs] Antw: [EXT] Re: Determine a resource's current host in the CIB

2020-09-29 Thread Digimer
ttr_name=0 --no-handle-values | \ > sed -e 's/(//' -e 's/)//' -e 's/"//g' -e 's/,/ /g' | grep "$1" | sort -k6n > -k1 > > So on output you have 7 fields: id, operation, on_node, rc-code, queue-time, > exec-time, and last-run >

Re: [ClusterLabs] Determine a resource's current host in the CIB

2020-09-24 Thread Digimer
more directly "VM x is on node y"; I'd like to avoid writing a new parser as I'll still need to read/parse the CIB anyway to know about off resources, and I have to assume there is a way to determine the same from the CIB itself. How to determine wh

Re: [ClusterLabs] Determine a resource's current host in the CIB

2020-09-24 Thread Digimer
e point (ie: what if crm_mon output changes format and breaks the regex used to pull data). Programs like crm_mon and pcs have a way of determining what is running where, and that's fundamentally what I am trying to do as well. digimer > On Wed, Sep 23, 2020 at 11:04 PM Digimer wrote: >

[ClusterLabs] Determine a resource's current host in the CIB

2020-09-23 Thread Digimer
f so, should I look at which node's 'exec-time' is higher, or which node has the higher 'call-id'? Or am I missing a more obvious way to tell what resource is running on which node? -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in

Re: [ClusterLabs] Why small foreign trade companies get a steady stream of orders;

2020-09-18 Thread Digimer
Sorry all, I'm not sure how this spam got through. I may have mis-clicked when filtering the queue. digimer On 2020-09-17 12:15 p.m., hello wrote: > Hello users > > ___ > Manage your subscription: > https://lists.clusterlabs.org/ma

[ClusterLabs] Triggering script on cib change

2020-09-15 Thread Digimer
Is there a way to invoke a script when something happens with the cluster? Be it a simple transition, stonith action, resource dis/enable or recovery, etc? -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s

[ClusterLabs] test, please ignore

2020-09-14 Thread Digimer
Mail server test, please ignore -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - S

Re: [ClusterLabs] Two-node Pacemaker cluster with "fence_aws" fence agent

2020-09-04 Thread Digimer
ld set the stonith device configuration that terminates node 1 to have, say, 'delay="15"'. This way, node 2 looks up how to fence node 1, sees the delay, and sleeps. Node 1 looks up how to fence node 2, sees no delay, and fences immediately. Node 2 is dead before the sleep exits, en
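The delay logic described in that snippet can be sketched as a toy model. This is only an illustration, not pacemaker or fence-agent code: the function name `fence_race_survivor` and the dictionary layout are hypothetical, and the model assumes each node attacks its peer at the same moment and that the only difference is the delay set on the device that targets each node.

```python
def fence_race_survivor(delay_on_device_targeting):
    """Return the surviving node of a two-node fence race, or None on a tie.

    delay_on_device_targeting maps a node name to the delay (in seconds)
    configured on the stonith device that would terminate that node.
    Hypothetical model: both nodes start fencing their peer at time 0.
    """
    nodes = list(delay_on_device_targeting)
    if len(nodes) != 2:
        raise ValueError("this sketch models exactly two nodes")
    first, second = nodes
    # A node is killed once its attacker has slept through the delay
    # configured on the device that targets it.
    first_dies_at = delay_on_device_targeting[first]
    second_dies_at = delay_on_device_targeting[second]
    if first_dies_at == second_dies_at:
        return None  # no head start: both fence calls may land
    # The node whose device carries the longer delay outlives its peer.
    return first if first_dies_at > second_dies_at else second


# delay="15" on the device that terminates node1, no delay for node2.
print(fence_race_survivor({"node1": 15, "node2": 0}))
```

With the values from the snippet, node1 is the survivor: node2 sleeps 15 seconds before it may fence node1, and node1 fences node2 immediately.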

[ClusterLabs] Format of '--lifetime' in 'pcs resource move'

2020-08-20 Thread Digimer
sing '--lifetime=60' as a test, assuming the format was 'seconds', but that was invalid. How is this switch meant to be used? Cheers -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s b

Re: [ClusterLabs] Beginner Question about VirtualDomain

2020-08-18 Thread Digimer
depends on the rest of our tools so it won't work outside the Anvil!. That said, if you wanted to use it before we release Anvil! M3, you could probably adapt it easily enough. If you have any questions, please let me know and I'll help as best I can. Cheers, digimer (Note: du

Re: [ClusterLabs] Warning; EL6 kernel 2.6.32-754.28.1 breaks bonding!

2020-06-23 Thread Digimer
On 2020-06-23 5:59 p.m., Hayden,Robert wrote: > >> -Original Message- >> From: Users On Behalf Of Digimer >> Sent: Monday, April 27, 2020 5:12 PM >> To: Cluster Labs - Users >> Subject: [ClusterLabs] Warning; EL6 kernel 2.6.32-754.28.1 breaks bonding!

[ClusterLabs] Warning; EL6 kernel 2.6.32-754.28.1 breaks bonding!

2020-04-27 Thread Digimer
' (or '/boot/efi/EFI/redhat/grub.conf') and change 'default=X' to the .27.1 kernel entry, reboot. After reboot, remove .28.1; -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than

[ClusterLabs] New APC AP7900 switch PDU and fencing

2020-04-21 Thread Digimer
leave that as an exercise for the reader. -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields a

[ClusterLabs] Anvil! M2 v2.0.8 released

2020-04-21 Thread Digimer
NOTE: M3 development is happening outside the Clusterlabs repo. It will be moved over once it reaches beta. For those wishing to follow it during alpha, the repo is: https://github.com/digimer/anvil -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in th

Re: [ClusterLabs] NFS in different subnets

2020-04-18 Thread Digimer
On 2020-04-18 2:48 a.m., Strahil Nikolov wrote: > On April 18, 2020 8:43:51 AM GMT+03:00, Digimer wrote: >> For what it's worth; A lot of HA specialists spent a lot of time trying >> to find the simplest _reliable_ way to do multi-site/geo-replicated HA. >> I am cer

Re: [ClusterLabs] NFS in different subnets

2020-04-17 Thread Digimer
there's a high chance you will corrupt data when you need it most. Of course, there's always a chance you'll come up with a system no one else has thought of, just be aware of what you know and what you don't. HA is fun, in big part, because it's a challenge to get right.

Re: [ClusterLabs] NFS in different subnets

2020-04-17 Thread Digimer
On 2020-04-17 3:20 p.m., Daniel Smith wrote: > Thank you digimer, and I apologize for getting the wrong email. > >   > > Booth was the piece I was missing.  Have been researching setting that > up and finding a third location for quorum. From what I have found, I > believe

Re: [ClusterLabs] NFS in different subnets

2020-04-17 Thread Digimer
uster), or it's destroyed. After this, it just becomes a question of implementation details. Having the master side update a DNS entry should be fine (though you may need to write a small resource agent to do it, not sure if one exists for DNS yet). digimer On

Re: [ClusterLabs] Solidarity during these extraordinary times

2020-03-18 Thread Digimer
together. > > Wishing you and your loved ones the best, > <3 -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal

Re: [ClusterLabs] A note for digimer re: qdevice documentation

2020-02-05 Thread Digimer
access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html-single/configuring_and_managing_high_availability_clusters/index#assembly_configuring-quorum-devices-configuring-cluster-quorum > > > Steven Thanks! -- Digimer Papers and Projects: https://alteeve.com/w/ "I a

Re: [ClusterLabs] 2020 Summit is right around the corner!

2019-12-02 Thread Digimer
I pasted my password into fpaste/IRC a month ago. So... ya. :) digimer On 2019-12-02 2:24 p.m., Steven Levine wrote: > I did *not* mean this message to go to the whole list. My profuse apologies. > > I haven't made this rookie mistake in a decade or more. > >

Re: [ClusterLabs] Final Pacemaker 2.0.3 release now available

2019-11-27 Thread Digimer
On 2019-11-27 7:27 p.m., Ken Gaillot wrote: > On Mon, 2019-11-25 at 23:02 -0500, Digimer wrote: >> Congrats! >> >> Can I ask, when might fencing become required? Is that still in the >> works, or has it been shelved? >> >> digimer > > tl;dr shelved >

Re: [ClusterLabs] Final Pacemaker 2.0.3 release now available

2019-11-25 Thread Digimer
Congrats! Can I ask, when might fencing become required? Is that still in the works, or has it been shelved? digimer On 2019-11-25 9:32 p.m., Ken Gaillot wrote: > Hi all, > > The final release of Pacemaker version 2.0.3 is now available at: > > https://github.com/Cluste

Re: [ClusterLabs] Announcing ClusterLabs Summit 2020

2019-11-11 Thread Digimer
+ HA stack, if there's > interest at all :) > > cheers, > Thomas Certainly! The more the merrier. :) -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people o

Re: [ClusterLabs] Announcing ClusterLabs Summit 2020

2019-11-07 Thread Digimer
> about a > PostgreSQL 12 support of the pgsql resource agent, or share our test results > and the issues from a user's point of view. > > Look forward to seeing you guys. > Thanks, Woohoo! Looking forward to seeing you again. :) -- Digimer Papers and Projects: https://alte

Re: [ClusterLabs] Antw: Re: Announcing ClusterLabs Summit 2020

2019-11-07 Thread Digimer
complete discussions will be. Being an attendee who listens and gives feedback is, itself, well worth it. So if you're on the fence, come. The more people who attend, the better for all. -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the we

Re: [ClusterLabs] Announcing ClusterLabs Summit 2020

2019-11-05 Thread Digimer
could participate in the event and, possibly, have a slot > to share our experience and the work being done, specially related to > testing in other architectures, like arm64 and s390x. > > I also hope this opportunity can make us closer to upstream, so we can > start contributing

Re: [ClusterLabs] Announcing ClusterLabs Summit 2020

2019-11-05 Thread Digimer
On 2019-11-05 10:09 a.m., Ken Gaillot wrote: > On Tue, 2019-11-05 at 00:39 -0500, Digimer wrote: >> On 2019-11-04 9:07 p.m., Ken Gaillot wrote: >>> Hi all, >>> >>> A reminder: We are still interested in ideas for talks, and rough >>> estimates of pote

Re: [ClusterLabs] Announcing ClusterLabs Summit 2020

2019-11-04 Thread Digimer
ersity (which is likely 2~3 years away, so I can leave it for the next summit in a couple years if we fill up the speaking slots). -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty

Re: [ClusterLabs] Antw: DLM, cLVM, GFS2 and OCFS2 managed by systemd instead of crm ?

2019-10-15 Thread Digimer
is your specific problem? Keeping the cluster configuration simple while > moving complexity to systemd? > > Do you know one command to describe your systemd configuration as short as the > cluster configuration (like crm configuration show)? > > Regards, > Ulrich This
