[ClusterLabs] FYI: clusterlabs.org planned outages

2024-05-07 Thread Ken Gaillot
Hi all,

We are in the process of changing the OS on the servers used to run the
clusterlabs.org sites. There is an expected outage of all services from
4AM to 9AM UTC this Thursday. If problems arise, there may be more
outages later Thursday and Friday.
-- 
Ken Gaillot 



Re: [ClusterLabs] Fast-failover on 2 nodes + qnetd: qdevice connection disrupted.

2024-05-06 Thread Ken Gaillot
On Mon, 2024-05-06 at 10:05 -0500, Ken Gaillot wrote:
> On Fri, 2024-05-03 at 16:18 +0300, ale...@pavlyuts.ru wrote:
> > Hi,
> > 
> > > > Thanks very much for your suggestion; I probably need to think
> > > > about this way too. However, the project environment is not a
> > > > good one to rely on fencing and, moreover, we can't control the
> > > > bottom layer in a trusted way.
> > > 
> > > That is a problem. A VM being gone is not the only possible
> > > failure
> > > scenario. For
> > > example, a kernel or device driver issue could temporarily freeze
> > > the node, or
> > > networking could temporarily drop out, causing the node to appear
> > > lost to
> > > Corosync, but the node could be responsive again (with the app
> > > running) after the
> > > app has been started on the other node.
> > > 
> > > If there's no problem with the app running on both nodes at the
> > > same time, then
> > > that's fine, but that's rarely the case. If an IP address is
> > > needed, or shared storage
> > > is used, simultaneous access will cause problems that only
> > > fencing
> > > can avoid.
> > Pacemaker uses a very pessimistic approach if you set resources to
> > require quorum.
> > If a network outage triggers changes, it will ruin quorum first and
> > after that try to rebuild it. Therefore there are two questions:
> > 1. How to keep the active app running?
> > 2. How to prevent two copies from being started.
> > As for me, quorum-dependent resource management performs well on
> > both points.
> 
> That's fine as long as the cluster is behaving properly. Fencing is
> for
> when it's not.
> 
> Quorum prevents multiple copies only if the nodes can communicate and
> operate normally. There are many situations when that's not true: a
> device driver or kernel bug locks up a node for more than the
> Corosync
> token timeout, CPU or I/O load gets high enough for a node to become
> unresponsive for long stretches of time, a failing network controller
> randomly drops large numbers of packets, etc.
> 
> In such situations, a node that appears lost to the other nodes may
> actually just be temporarily unreachable, and may come back at any
> moment (with its resources still active).
> 
> If an IP address is active in more than one location, packets will be
> randomly routed to one node or another, rendering all communication
> via
> that IP useless. If an application that uses shared storage is active
> in more than one location, data can be corrupted. And so forth.
> 
> Fencing ensures that the lost node is *definitely* not running
> resources before recovering them elsewhere.
> 
> > > > my goal is to keep the app from moving (i.e. restarting) as
> > > > long as possible. This means only two kinds of moves are
> > > > accepted: current host failure (move to the other host with a
> > > > restart) or an admin move (a managed move at a certain time
> > > > with a restart). Any other trouble should NOT trigger an app
> > > > down/restart, except total connectivity loss where there is no
> > > > second node and no arbiter => stop the service.
> > > 
> > > Total connectivity loss may not be permanent. Fencing ensures the
> > > connectivity
> > > will not be restored after the app is started elsewhere.
> > Nothing bad happens if it is restored and the node is alive but has
> > the app down because of no quorum.
> 
> Again, that assumes it is operating normally. HA is all about the
> times
> when it's not.
>  
> > > Pacemaker 2.0.4 and later supports priority-fencing-delay which
> > > allows the node
> > > currently running the app to survive. The node not running the
> > > app
> > > will wait the
> > > configured amount of time before trying to fence the other node.
> > > Of
> > > course that
> > > does add more time to the recovery if the node running the app is
> > > really gone.
> > I am not sure about how this works.
> > Imagine a connectivity loss just between the nodes, but not to the
> > other parts, and Node1 runs the app. Everything is well, node2 is
> > off.
> > So, we start Node2 with the intention of restoring the cluster.
> > Node2 starts, tries to find its partner, fails, and fences node1
> > out, while Node1 does not even know that Node2 has started.
> > 
> > 

Re: [ClusterLabs] Fast-failover on 2 nodes + qnetd: qdevice connection disrupted.

2024-05-06 Thread Ken Gaillot
> ... reliable without fencing.
> May a host hold quorum a bit longer after another host has gotten
> quorum and run the app? Probably, it may do this.
> But fencing is not immediate either. So, it can't protect 100% from
> short-time parallel runs.

Certainly -- with fencing enabled, the cluster will not recover
resources elsewhere until fencing succeeds.

> 
> > That does complicate the situation. Ideally there would be some way
> > to request
> > the VM to be immediately destroyed (whether via fence_xvm, a cloud
> > provider
> > API, or similar).
> What do you mean by "destroyed"? Do you mean taken down?

Correct. For fencing purposes, it should not be a clean shutdown but an
immediate halt.

> 
> > > Please, mind all the above is from my common sense and quite poor
> > > fundamental knowledge in clustering. And please be so kind to
> > > correct
> > > me if I am wrong at any point.
> > > 
> > > Sincerely,
> > > 
> > > Alex
> > > -Original Message-
> > > From: Users  On Behalf Of Ken
> > > Gaillot
> > > Sent: Thursday, May 2, 2024 5:55 PM
> > > To: Cluster Labs - All topics related to open-source clustering
> > > welcomed 
> > > Subject: Re: [ClusterLabs] Fast-failover on 2 nodes + qnetd:
> > > qdevice
> > > connection disrupted.
> > > 
> > > I don't see fencing times in here -- fencing is absolutely
> > > essential.
> > > 
> > > With the setup you describe, I would drop qdevice. With fencing,
> > > quorum is not strictly required in a two-node cluster (two_node
> > > should
> > > be set in corosync.conf). You can set priority-fencing-delay to
> > > reduce
> > > the chance of simultaneous fencing. For VMs, you can use
> > > fence_xvm,
> > > which is extremely quick.
> > > 
> > > On Thu, 2024-05-02 at 02:56 +0300, ale...@pavlyuts.ru wrote:
> > > > Hi All,
> > > > 
> > > > I am trying to build an application-specific 2-node failover
> > > > cluster using Ubuntu 22, Pacemaker 2.1.2 + Corosync 3.1.6 and
> > > > DRBD 9.2.9, knet transport.
> > > > 
> > > > For some reason I can't use 3 nodes, so I have to use
> > > > qnetd+qdevice 3.0.1.
> > > > 
> > > > The main goal is to protect a custom app which is not
> > > > cluster-aware by itself. It is quite stateful, can't store its
> > > > state outside memory, and takes some time to converge with the
> > > > other parts of the system, so the best scenario is "failover is
> > > > a restart with the same config", but each unnecessary restart
> > > > is painful. So, once failover is done, the app must remain on
> > > > the backup node until it fails or an admin pushes it back; this
> > > > works well with the stickiness param.
> > > > 
> > > > So, the goal is to detect a serving-node failure ASAP and
> > > > restart the app ASAP on the other node, using DRBD-synced
> > > > config/data. ASAP means within 5-7 sec, not 30 or more.
> > > > 
> > > > I tried different combinations of timing, and finally got an
> > > > acceptable result within 5 sec for the best case. But the setup
> > > > is very unstable.
> > > > 
> > > > My setup is simple: two nodes on VMs, and one more VM as an
> > > > arbiter (qnetd), the VMs running under Proxmox and connected
> > > > via an external Ethernet switch to get closer to reality, where
> > > > the "node VMs" would live on different physical hosts in one
> > > > rack.
> > > > 
> > > > Then, it was adjusted for faster detection and failover.
> > > > In Corosync, I left the token at the default 1000ms but added
> > > > "heartbeat_failures_allowed: 3"; this made Corosync catch a
> > > > node failure in about 200ms (4x50ms heartbeat).
> > > > Both qnetd and qdevice were run with
> > > > net_heartbeat_interval_min=200 to allow playing with faster
> > > > heartbeats and detection. Also, quorum.device.net has timeout:
> > > > 500, sync_timeout: 3000, algo: LMS.
> > > > 
> > > > The testing is to issue "date +%M:%S.%N && qm stop 201", and
> > > > then check the logs for the timestamp when the app started 
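For reference, the timing described above maps onto corosync.conf roughly as
follows. This is a sketch only, reusing the values quoted above; the qnetd
host name is a placeholder, and the net_heartbeat_interval_min tweak is a
separate qnetd/qdevice daemon setting, not part of this file:

  totem {
      token: 1000
      # heartbeat-based failure detection, as described above
      heartbeat_failures_allowed: 3
  }
  quorum {
      provider: corosync_votequorum
      device {
          model: net
          timeout: 500
          sync_timeout: 3000
          net {
              host: qnetd-host.example.com
              algorithm: lms
          }
      }
  }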

Re: [ClusterLabs] Fast-failover on 2 nodes + qnetd: qdevice connection disrupted.

2024-05-02 Thread Ken Gaillot
On Thu, 2024-05-02 at 22:56 +0300, ale...@pavlyuts.ru wrote:
> Dear Ken, 
> 
> First of all, there is no fencing at all; it is off.
> 
> Thanks very much for your suggestion; I probably need to think about
> this way too. However, the project environment is not a good one to
> rely on fencing and, moreover, we can't control the bottom layer in a
> trusted way.

That is a problem. A VM being gone is not the only possible failure
scenario. For example, a kernel or device driver issue could
temporarily freeze the node, or networking could temporarily drop out,
causing the node to appear lost to Corosync, but the node could be
responsive again (with the app running) after the app has been started
on the other node.

If there's no problem with the app running on both nodes at the same
time, then that's fine, but that's rarely the case. If an IP address is
needed, or shared storage is used, simultaneous access will cause
problems that only fencing can avoid.

> 
> As I understand it, fence_xvm just kills the VM that is not inside the
> quorate partition, or, in the two-host case, only the one who shoots
> first survives. But

Correct

> my goal is to keep the app from moving (i.e. restarting) as long as
> possible. This means only two kinds of moves are accepted: current
> host failure (move to the other host with a restart) or an admin move
> (a managed move at a certain time with a restart). Any other trouble
> should NOT trigger an app down/restart, except total connectivity
> loss where there is no second node and no arbiter => stop the service.
> 

Total connectivity loss may not be permanent. Fencing ensures the
connectivity will not be restored after the app is started elsewhere.

> AFAIK, fencing in two-node setups creates a nondeterministic fence
> race, and even though it guarantees that only one node survives, it
> has no regard for whether the app is already running on that node or
> not. So, the situation: one node already runs the app, while the
> other has lost its connection to the first but not to the fence
> device, and wins the race => kills the currently active node => the
> app restarts. That's exactly what I am trying to avoid.


Pacemaker 2.0.4 and later supports priority-fencing-delay which allows
the node currently running the app to survive. The node not running the
app will wait the configured amount of time before trying to fence the
other node. Of course that does add more time to the recovery if the
node running the app is really gone.
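For illustration only (the resource name is a placeholder), the delay is
driven by a resource priority plus one cluster property, along these lines:

  # give the app resource a priority so its node is preferred to survive
  pcs resource meta my_app priority=10
  # the node NOT running a priority resource waits this long before fencing
  pcs property set priority-fencing-delay=15s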

> 
> Therefore, quorum-based management seems a better way for my exact
> case.

Unfortunately it's unsafe without fencing.

> 
> Also, VM fencing relies on the idea that all VMs are inside a well-
> managed first-layer cluster with its own quorum/fencing in place, or
> on separate nodes, and that VMs are never moved around without a
> careful fencing reconfig. In my case, I can't be sure about either
> point; I do not manage the bottom layer. The most I can do is to
> request that each of my VMs (nodes, arbiter) be located on a
> different physical node, which may protect the app from node failure
> and bring more freedom to take nodes off for service. Also, I have to
> limit the overall VM count, since multiple app instances (VM pairs)
> need to run at once with one extra VM as arbiter for all of them
> (2*N+1), rather than 3 nodes for each instance (3*N), which would be
> more reasonable in my opinion, but not for the one who allocates
> resources.

That does complicate the situation. Ideally there would be some way to
request the VM to be immediately destroyed (whether via fence_xvm, a
cloud provider API, or similar).
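Where fence_xvm is an option (i.e. fence_virtd is set up on the hypervisors
and the key has been distributed to the guests), a minimal sketch might look
like this, with placeholder VM and node names:

  pcs stonith create fence_node1 fence_xvm port="node1-vm" pcmk_host_list="node1"
  pcs stonith create fence_node2 fence_xvm port="node2-vm" pcmk_host_list="node2"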

> 
> Please, mind all the above is from my common sense and quite poor
> fundamental knowledge in clustering. And please be so kind to correct
> me if I am wrong at any point.
> 
> Sincerely,
> 
> Alex
> -Original Message-
> From: Users  On Behalf Of Ken Gaillot
> Sent: Thursday, May 2, 2024 5:55 PM
> To: Cluster Labs - All topics related to open-source clustering
> welcomed 
> Subject: Re: [ClusterLabs] Fast-failover on 2 nodes + qnetd: qdevice
> connection disrupted.
> 
> I don't see fencing times in here -- fencing is absolutely essential.
> 
> With the setup you describe, I would drop qdevice. With fencing,
> quorum is not strictly required in a two-node cluster (two_node
> should be set in corosync.conf). You can set priority-fencing-delay
> to reduce the chance of simultaneous fencing. For VMs, you can use
> fence_xvm, which is extremely quick.
> 
> On Thu, 2024-05-02 at 02:56 +0300, ale...@pavlyuts.ru wrote:
> > Hi All,
> >  
> > I am trying to build application-specific 2-node failover cluster 
> > using ubuntu 22, pacemaker 2.1.2 + corosync 3.1.6 and DRBD 9.2.9,
> > knet 
> > transport.
> >  
> > For some reason I can’t use 3-node then I have to use
> > qnetd+qdevice 
> > 3.0.1.
> >  
> > The main goal Is to protect custom a

Re: [ClusterLabs] Fast-failover on 2 nodes + qnetd: qdevice connection disrupted.

2024-05-02 Thread Ken Gaillot
> ... corosync-qdevice[781]: Received preinit reply msg
> May 01 23:30:56 node2 corosync-qdevice[781]: Received init reply msg
> May 01 23:30:56 node2 corosync-qdevice[781]: Scheduling send of
> heartbeat every 400ms
> May 01 23:30:56 node2 corosync-qdevice[781]: Executing after-connect
> heuristics.
> May 01 23:30:56 node2 corosync-qdevice[781]: worker:
> qdevice_heuristics_worker_cmd_process_exec: Received exec command
> with seq_no "25" and timeout "250"
> May 01 23:30:56 node2 corosync-qdevice[781]: Received heuristics exec
> result command with seq_no "25" and result "Disabled"
> May 01 23:30:56 node2 corosync-qdevice[781]: Algorithm decided to
> send config node list, send membership node list, send quorum node
> list, heuristics is Undefined and result vote is Wait for reply
> May 01 23:30:56 node2 corosync-qdevice[781]: Sending set option seq =
> 98, HB(0) = 0ms, KAP Tie-breaker(1) = Enabled
> May 01 23:30:56 node2 corosync-qdevice[781]: Sending config node list
> seq = 99
> May 01 23:30:56 node2 corosync-qdevice[781]:   Node list:
> May 01 23:30:56 node2 corosync-qdevice[781]: 0 node_id = 1,
> data_center_id = 0, node_state = not set
> May 01 23:30:56 node2 corosync-qdevice[781]: 1 node_id = 2,
> data_center_id = 0, node_state = not set
> May 01 23:30:56 node2 corosync-qdevice[781]: Sending membership node
> list seq = 100, ringid = (2.801), heuristics = Undefined.
> May 01 23:30:56 node2 corosync-qdevice[781]:   Node list:
> May 01 23:30:56 node2 corosync-qdevice[781]: 0 node_id = 2,
> data_center_id = 0, node_state = not set
> May 01 23:30:56 node2 corosync-qdevice[781]: Sending quorum node list
> seq = 101, quorate = 0
> May 01 23:30:56 node2 corosync-qdevice[781]:   Node list:
> May 01 23:30:56 node2 corosync-qdevice[781]: 0 node_id = 1,
> data_center_id = 0, node_state = dead
> May 01 23:30:56 node2 corosync-qdevice[781]: 1 node_id = 2,
> data_center_id = 0, node_state = member
> May 01 23:30:56 node2 corosync-qdevice[781]: Cast vote timer is now
> stopped.
> May 01 23:30:56 node2 corosync-qdevice[781]: Received set option
> reply seq(1) = 98, HB(0) = 0ms, KAP Tie-breaker(1) = Enabled
> May 01 23:30:56 node2 corosync-qdevice[781]: Received initial config
> node list reply
> May 01 23:30:56 node2 corosync-qdevice[781]:   seq = 99
> May 01 23:30:56 node2 corosync-qdevice[781]:   vote = No change
> May 01 23:30:56 node2 corosync-qdevice[781]:   ring id = (2.801)
> May 01 23:30:56 node2 corosync-qdevice[781]: Algorithm result vote is
> No change
> May 01 23:30:56 node2 corosync-qdevice[781]: Received membership node
> list reply
> May 01 23:30:56 node2 corosync-qdevice[781]:   seq = 100
> May 01 23:30:56 node2 corosync-qdevice[781]:   vote = ACK
> May 01 23:30:56 node2 corosync-qdevice[781]:   ring id = (2.801)
> May 01 23:30:56 node2 corosync-qdevice[781]: Algorithm result vote is
> ACK
> May 01 23:30:56 node2 corosync-qdevice[781]: Cast vote timer is now
> scheduled every 250ms voting ACK.
> May 01 23:30:56 node2 corosync-qdevice[781]: Received quorum node
> list reply
> May 01 23:30:56 node2 corosync-qdevice[781]:   seq = 101
> May 01 23:30:56 node2 corosync-qdevice[781]:   vote = ACK
> May 01 23:30:56 node2 corosync-qdevice[781]:   ring id = (2.801)
> May 01 23:30:56 node2 corosync-qdevice[781]: Algorithm result vote is
> ACK
> May 01 23:30:56 node2 corosync-qdevice[781]: Cast vote timer remains
> scheduled every 250ms voting ACK.
> May 01 23:30:56 node2 corosync-qdevice[781]: Votequorum quorum notify
> callback:
> May 01 23:30:56 node2 corosync-qdevice[781]:   Quorate = 1
> May 01 23:30:56 node2 corosync-qdevice[781]:   Node list (size = 3):
> May 01 23:30:56 node2 corosync-qdevice[781]: 0 nodeid = 1, state
> = 2
> May 01 23:30:56 node2 corosync-qdevice[781]: 1 nodeid = 2, state
> = 1
> May 01 23:30:56 node2 corosync-qdevice[781]: 2 nodeid = 0, state
> = 0
> May 01 23:30:56 node2 corosync-qdevice[781]: algo-lms: quorum_notify.
> quorate = 1
> May 01 23:30:56 node2 corosync-qdevice[781]: Algorithm decided to
> send list and result vote is No change
> May 01 23:30:56 node2 corosync-qdevice[781]: Sending quorum node list
> seq = 102, quorate = 1
> May 01 23:30:56 node2 corosync-qdevice[781]:   Node list:
> May 01 23:30:56 node2 corosync-qdevice[781]: 0 node_id = 1,
> data_center_id = 0, node_state = dead
> May 01 23:30:56 node2 corosync-qdevice[781]: 1 node_id = 2,
> data_center_id = 0, node_state = member
> May 01 23:30:56 node2 corosync-qdevice[781]: Received quorum node
> list reply
> May 01 23:30:56 node2 corosync-qdevice[781]:   seq = 102
> May 01 23:30:56 node2 corosync-qdevice[781]:   vote = ACK
> May 01 23:30:56 node2 corosync-qdevice[781]:   ring id = (2

Re: [ClusterLabs] "pacemakerd: recover properly from Corosync crash" fix

2024-04-18 Thread Ken Gaillot
What OS are you using? Does it use systemd?

What does happen when you kill Corosync?
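For what it's worth, one way to reproduce and observe this on a systemd-based
distro is roughly:

  # simulate a Corosync crash on one node
  killall -9 corosync
  # watch whether pacemakerd notices the loss and whether systemd restarts things
  journalctl -f -u corosync -u pacemaker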

On Thu, 2024-04-18 at 13:13 +, NOLIBOS Christophe via Users wrote:
> 
> Dear All,
>  
> I have a question about the "pacemakerd: recover properly from
> Corosync crash" fix implemented in version 2.1.2.
> I have observed the issue when testing pacemaker version 2.0.5, just
> by killing the ‘corosync’ process: Corosync was not recovered.
>  
> I am using now pacemaker version 2.1.5-8.
> Doing the same test, I have the same result: Corosync is still not
> recovered.
>  
> Please confirm the "pacemakerd: recover properly from Corosync crash"
> fix implemented in version 2.1.2 covers this scenario.
> If it is, did I miss something in the configuration of my cluster?
>  
> Best Regard.
>  
> Christophe.
>  
> 
> Christophe Nolibos
> DL-FEP Component Manager
> THALES Land & Air Systems
> 105, avenue du Général Eisenhower, 31100 Toulouse, FRANCE
> Tél. : +33 (0)5 61 19 79 09
> Mobile : +33 (0)6 31 22 20 58
> Email : christophe.noli...@thalesgroup.com
>  
>  
> 
-- 
Ken Gaillot 



[ClusterLabs] Likely deprecation: ocf:pacemaker:o2cb resource agent

2024-04-17 Thread Ken Gaillot
Hi all,

I just discovered today that the OCFS2 file system hasn't needed
ocfs2_controld.pcmk in nearly a decade. I can't recall ever running
across anyone using the ocf:pacemaker:o2cb agent that manages that
daemon in a cluster.

Unless anyone has a good reason to the contrary, we'll deprecate the
agent for the Pacemaker 2.1.8 release and drop it for 3.0.0.
-- 
Ken Gaillot 



[ClusterLabs] Potential deprecation: Node-attribute-based rules in operation meta-attributes

2024-04-02 Thread Ken Gaillot
Hi all,

I have recently been cleaning up Pacemaker's rule code, and came across
an inconsistency.

Currently, meta-attributes may have rules with date/time-based
expressions (the <date_expression> element). Node attribute
expressions (the <expression> element) are not allowed, with the
exception of operation meta-attributes (beneath an <op> or
<op_defaults> element).
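In other words, a currently-accepted configuration like the following sketch
(the ids and the node attribute are made up) would eventually become invalid:

  <op id="rsc1-monitor-10s" name="monitor" interval="10s">
    <meta_attributes id="rsc1-monitor-meta">
      <rule id="rsc1-monitor-rule" score="INFINITY">
        <expression id="rsc1-monitor-expr" attribute="fast-node"
                    operation="eq" value="true"/>
      </rule>
      <nvpair id="rsc1-monitor-onfail" name="on-fail" value="block"/>
    </meta_attributes>
  </op>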

I'd like to deprecate support for node attribute expressions for
operation meta-attributes in Pacemaker 2.1.8, and drop support in
3.0.0.

I don't think it makes sense to vary meta-attributes by node. For
example, if a clone monitor has on-fail="block" (to cease all actions
on instances everywhere) on one node and on-fail="stop" (to stop all
instances everywhere) on another node, what should the cluster do if
monitors fail on both nodes? It seems to me that it's more likely to be
confusing than helpful.

If anyone has a valid use case for node attribute expressions for
operation meta-attributes, now is the time to speak up!
-- 
Ken Gaillot 



[ClusterLabs] Potential deprecation: Disabling schema validation for the CIB

2024-04-02 Thread Ken Gaillot
Hi all,

Pacemaker uses an XML schema to prevent invalid syntax from being added
to the CIB. The CIB's "validate-with" option is typically set to a
version of this schema (like "pacemaker-3.9").

It is possible to explicitly disable schema validation by setting
validate-with to "none". This is clearly a bad idea since it allows
invalid syntax to be added, which will at best be ignored and at worst
cause undesired or buggy behavior.

I'm thinking of deprecating the ability to use "none" in Pacemaker
2.1.8 and dropping support in 3.0.0. If anyone has a valid use case for
this feature, now is the time to speak up!
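For anyone currently running with validate-with="none", getting back to a
validated configuration is typically something like:

  # see what the CIB is currently validating against
  cibadmin --query | grep validate-with
  # re-enable validation by upgrading to the latest schema the tools support
  cibadmin --upgrade --force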
-- 
Ken Gaillot 



Re: [ClusterLabs] resources cluster stopped with one node

2024-03-20 Thread Ken Gaillot
On Wed, 2024-03-20 at 23:29 +0100, mierdatutis mi wrote:
> Hi,
> I've configured a cluster of two nodes.
> When I start only one node, I see that the resources won't start.

Hi,

In a two-node cluster, it is not safe to start resources until the
nodes have seen each other once. Otherwise, there's no way to know
whether the other node is unreachable because it is safely down or
because communication has been interrupted (meaning it could still be
running resources).

Corosync's two_node setting automatically takes care of that by also
enabling wait_for_all. If you are certain that the other node is down,
you can disable wait_for_all in the Corosync configuration, start the
node, then re-enable wait_for_all.
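For reference, the relevant corosync.conf fragment is the quorum section;
with two_node set, wait_for_all defaults to on and can be toggled explicitly
as described above (sketch only):

  quorum {
      provider: corosync_votequorum
      two_node: 1
      # implied by two_node; set to 0 only temporarily, while you are
      # certain the other node is down, then set it back to 1
      wait_for_all: 1
  }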


> 
> [root@nodo1 ~]# pcs status --full
> Cluster name: mycluster
> Stack: corosync
> Current DC: nodo1 (1) (version 1.1.23-1.el7-9acf116022) - partition
> WITHOUT quorum
> Last updated: Wed Mar 20 23:28:45 2024
> Last change: Wed Mar 20 19:33:09 2024 by root via cibadmin on nodo1
> 
> 2 nodes configured
> 3 resource instances configured
> 
> Online: [ nodo1 (1) ]
> OFFLINE: [ nodo2 (2) ]
> 
> Full list of resources:
> 
>  Virtual_IP (ocf::heartbeat:IPaddr2):   Stopped
>  Resource Group: HA-LVM
>  My_VG  (ocf::heartbeat:LVM-activate):  Stopped
>  My_FS  (ocf::heartbeat:Filesystem):Stopped
> 
> Node Attributes:
> * Node nodo1 (1):
> 
> Migration Summary:
> * Node nodo1 (1):
> 
> Fencing History:
> 
> PCSD Status:
>   nodo1: Online
>   nodo2: Offline
> 
> Daemon Status:
>   corosync: active/enabled
>   pacemaker: active/enabled
>   pcsd: active/enabled
> 
> Do you know what causes this behavior?
> Thanks
-- 
Ken Gaillot 



Re: [ClusterLabs] Can I add a pacemaker v2 node to a v2 cluster ?

2024-03-04 Thread Ken Gaillot
On Fri, 2024-03-01 at 15:41 +, Morgan Cox wrote:
> Hi - I have a fair few rhel7 pacemaker clusters (running on pacemaker
> v1). I want to migrate to rhel8 (which uses pacemaker v2) - due to
> using a shared/cluster IP it would be a pain and involve downtime to
> take down the cluster IP on the rhel7 cluster and then set it up on
> rhel8.
> 
> Can I add a pacemaker v2 node to a pacemaker v1 one (in order to
> migrate easily)?

No, the Corosync versions are incompatible. The closest you could do
would be to set up booth between the Pacemaker 1 and 2 clusters, then
grant the ticket to the new cluster. That's effectively the same as
manually disabling the IP on the old cluster and enabling it on the new
one, but a little more automated (which is more helpful the more
resources you have to move).
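Very roughly, and leaving aside the booth site/arbitrator setup itself, that
would mean tying the shared IP (and whatever else moves) to a ticket on both
clusters and then granting the ticket only to the new cluster - something
like the following sketch, with placeholder names:

  # on each cluster: only run the IP while this cluster holds the ticket
  pcs constraint ticket add migration-ticket cluster_ip loss-policy=stop
  # at cutover: revoke the ticket from the old cluster, grant it to the new
  # one (via booth, or with crm_ticket for a manually managed ticket)
  crm_ticket --ticket migration-ticket --revoke
  crm_ticket --ticket migration-ticket --grant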

> 
> i.e i have these versions 
> 
> rhel7 : pacemaker-1.1.23-1.el7_9.1.x86_64
> rhel8: pacemaker-2.1.5-8.el8.x86_64
> 
> For the purposes of migrating to rhel8 - could I add a rhel8
> (pacemaker v2) node to an existing rhel7 (pacemaker v1.x) cluster?
> 
> If it matters here is the 'supporting' versions of both (using #
> pacemakerd --features )
> 
> rhel7: Supporting v3.0.14:  generated-manpages agent-manpages ncurses
> libqb-logging libqb-ipc systemd nagios  corosync-native atomic-attrd
> acls
> 
> rhel8 :  Supporting v3.16.2: agent-manpages cibsecrets compat-2.0
> corosync-ge-2 default-concurrent-fencing default-sbd-sync generated-
> manpages monotonic nagios ncurses remote systemd
> 
> If this is not possible I will have to think of another solution.
> 
> Thanks 
> 
-- 
Ken Gaillot 



Re: [ClusterLabs] Is it possible to downgrade feature-set in 2.1.6-8

2024-02-26 Thread Ken Gaillot
On Thu, 2024-02-22 at 08:05 -0500, vitaly wrote:
> Hello. 
> We have a product with 2 node clusters.
> Our current version uses Pacemaker 2.1.4; the new version will use
> Pacemaker 2.1.6.
> During an upgrade failure it is possible that one node will come up
> with the new Pacemaker and work alone for a while.
> The old node would later come up and try to join the cluster.
> This would fail due to the different feature sets of the cluster
> nodes: the older feature set would not be able to join the newer
> one.
>  
> Question: 
> Is it possible to force the new node with Pacemaker 2.1.6 to use the
> older feature set (3.15.0) for a while, until the second node is
> upgraded and able to work with Pacemaker 2.1.6?

No

>  
> Thank you very much!
> _Vitaly
> 
-- 
Ken Gaillot 



Re: [ClusterLabs] clone_op_key pcmk__notify_key - Triggered fatal assertion

2024-02-19 Thread Ken Gaillot
On Sat, 2024-02-17 at 13:39 +0100, lejeczek via Users wrote:
> Hi guys.
> 
> Everything seems to be working OK, yet pacemaker logs
> ...
>  error: clone_op_key: Triggered fatal assertion at
> pcmk_graph_producer.c:207 : (n_type != NULL) && (n_task != NULL)
>  error: pcmk__notify_key: Triggered fatal assertion at actions.c:187
> : op_type != NULL
>  error: clone_op_key: Triggered fatal assertion at
> pcmk_graph_producer.c:207 : (n_type != NULL) && (n_task != NULL)
>  error: pcmk__notify_key: Triggered fatal assertion at actions.c:187
> : op_type != NULL
> ...
>  error: pcmk__create_history_xml: Triggered fatal assertion at
> pcmk_sched_actions.c:1163 : n_type != NULL
>  error: pcmk__create_history_xml: Triggered fatal assertion at
> pcmk_sched_actions.c:1164 : n_task != NULL
>  error: pcmk__notify_key: Triggered fatal assertion at actions.c:187
> : op_type != NULL
> ...
> 
> Looks critical - is it? Would you know?
> many thanks, L.
> 

That's odd. This suggests the scheduler created a notify action without
adding all the necessary information, which would be a bug. Do you have
the scheduler input that causes these messages? 

Also, what version are you using, and how did you get it?
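For reference, the scheduler inputs being asked about are saved on the node
that was DC at the time and can be replayed, roughly like this (the file name
is just an example):

  # most recent scheduler inputs
  ls -lt /var/lib/pacemaker/pengine/ | head
  # replay one to see whether it reproduces the assertion messages
  crm_simulate -S -x /var/lib/pacemaker/pengine/pe-input-123.bz2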
--
Ken Gaillot 



Re: [ClusterLabs] pacemaker resource configure issue

2024-02-08 Thread Ken Gaillot
On Thu, 2024-02-08 at 10:12 +0800, hywang via Users wrote:
> Hello, everyone,
>  I want to have a node fenced or the cluster stopped after a
> resource start has failed 3 times. How do I configure the resource to
> achieve that?
> Thanks!
> 

The current design doesn't allow it. You can set start-failure-is-fatal 
to false to let the cluster reattempt the start and migration-threshold 
to 3 to have it try to start on a different node after three failures,
or you can set on-fail to fence to have it fence the node if the
(first) start fails, but you can't combine those approaches.

It's a longstanding goal to allow more flexibility in failure handling,
but there hasn't been time to deal with it.
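Concretely, the two mutually exclusive approaches look roughly like this (the
resource name is a placeholder, and the exact op syntax depends on how the
start operation is already defined):

  # approach 1: retry the start, move elsewhere after 3 start failures
  pcs property set start-failure-is-fatal=false
  pcs resource meta my_app migration-threshold=3

  # approach 2: fence the node as soon as the first start fails
  pcs resource update my_app op start on-fail=fence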
-- 
Ken Gaillot 



Re: [ClusterLabs] how to disable pacemaker throttle mode

2024-02-05 Thread Ken Gaillot
On Mon, 2024-02-05 at 21:30 +0100, Vladislav Bogdanov wrote:
> IIRC, there is one issue with that: IO load is considered
> CPU load, so on busy storage servers you get throttling with an almost
> free CPU. I may be wrong, but load is calculated from loadavg, which 

Yep, it checks the 1-minute average from /proc/loadavg (it also checks
the CIB manager separately using utime/stime from /proc/PID/stat)

> is a different story altogether, as it indicates the number of
> processes that are ready to consume CPU time, including those waiting
> for IOs to complete - but that is what my mind recalls.
> 
> I easily get loadavg of 128 on iscsi storage servers with almost free
> CPU, no thermal reaction at all.
> 
> Best,
> Vlad
> 
> On February 5, 2024 19:22:11 Ken Gaillot  wrote:
> 
> > On Mon, 2024-02-05 at 18:08 +0800, hywang via Users wrote:
> > > hello, everyone:
> > > Is there any way to disable pacemaker throttle mode. If there is,
> > > where to find it?
> > > Thanks!
> > > 
> > > 
> > 
> > You can influence it via the load-threshold and node-action-limit
> > cluster options.
> > 
> > The cluster throttles when CPU usage approaches load-threshold
> > (defaulting to 80%), and limits the number of simultaneous actions
> > on a
> > node to node-action-limit (defaulting to twice the number of
> > cores).
> > 
> > The node action limit can be overridden per node by setting the
> > PCMK_node_action_limit environment variable (typically in
> > /etc/sysconfig/pacemaker, /etc/default/pacemaker, etc. depending on
> > distro).
> > -- 
> > Ken Gaillot 
> > 
> > 
> 
> 
-- 
Ken Gaillot 



Re: [ClusterLabs] how to disable pacemaker throttle mode

2024-02-05 Thread Ken Gaillot
On Mon, 2024-02-05 at 18:08 +0800, hywang via Users wrote:
> hello, everyone:
> Is there any way to disable pacemaker throttle mode. If there is,
> where to find it?
> Thanks!
> 

You can influence it via the load-threshold and node-action-limit
cluster options.

The cluster throttles when CPU usage approaches load-threshold
(defaulting to 80%), and limits the number of simultaneous actions on a
node to node-action-limit (defaulting to twice the number of cores).

The node action limit can be overridden per node by setting the
PCMK_node_action_limit environment variable (typically in
/etc/sysconfig/pacemaker, /etc/default/pacemaker, etc. depending on
distro).
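As a rough example (values are illustrative only):

  # raise the CPU threshold at which throttling kicks in
  pcs property set load-threshold=95%
  # allow more simultaneous actions per node cluster-wide
  pcs property set node-action-limit=16

  # or per node, e.g. in /etc/sysconfig/pacemaker (path varies by distro):
  PCMK_node_action_limit=32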
-- 
Ken Gaillot 



Re: [ClusterLabs] Gracefully Failing Live Migrations

2024-02-01 Thread Ken Gaillot
On Thu, 2024-02-01 at 12:57 -0600, Billy Croan wrote:
> How do I figure out which of the three steps failed and why?

They're normal resource actions: migrate_to, migrate_from, and stop.
You can investigate them in the usual way (status, logs).
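For example, with the vm_myvm resource from this thread, something along
these lines:

  # which action failed, on which node, and its exit reason
  pcs status --full
  # matching log entries on the nodes involved
  journalctl -u pacemaker | grep -E 'vm_myvm.*(migrate_to|migrate_from|stop)'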

> 
> On Thu, Feb 1, 2024 at 11:15 AM Ken Gaillot 
> wrote:
> > On Thu, 2024-02-01 at 10:20 -0600, Billy Croan wrote:
> > > Sometimes I've tried to move a resource from one node to another,
> > and
> > > it migrates live without a problem.  Other times I get 
> > > > Failed Resource Actions:
> > > > * vm_myvm_migrate_to_0 on node1 'unknown error' (1): call=102,
> > > > status=complete, exitreason='myvm: live migration to node2
> > failed:
> > > > 1',
> > > > last-rc-change='Sat Jan 13 09:13:31 2024', queued=1ms,
> > > > exec=35874ms
> > > > 
> > > 
> > > And I find out the live part of the migration failed, when the vm
> > > reboots and an (albeit minor) outage occurs.
> > > 
> > > Is there a way to configure pacemaker, so that if it is unable to
> > > migrate live it simply does not migrate at all?
> > > 
> > 
> > No. Pacemaker automatically replaces a required stop/start sequence
> > with live migration when possible. If there is a live migration
> > attempted, by definition the resource must move one way or another.
> > Also, live migration involves three steps, and if one of them
> > fails,
> > the resource is in an unknown state, so it must be restarted
> > anyway.
-- 
Ken Gaillot 



Re: [ClusterLabs] Gracefully Failing Live Migrations

2024-02-01 Thread Ken Gaillot
On Thu, 2024-02-01 at 10:20 -0600, Billy Croan wrote:
> Sometimes I've tried to move a resource from one node to another, and
> it migrates live without a problem.  Other times I get 
> > Failed Resource Actions:
> > * vm_myvm_migrate_to_0 on node1 'unknown error' (1): call=102,
> > status=complete, exitreason='myvm: live migration to node2 failed:
> > 1',
> > last-rc-change='Sat Jan 13 09:13:31 2024', queued=1ms,
> > exec=35874ms
> > 
> 
> And I find out the live part of the migration failed, when the vm
> reboots and an (albeit minor) outage occurs.
> 
> Is there a way to configure pacemaker, so that if it is unable to
> migrate live it simply does not migrate at all?
> 

No. Pacemaker automatically replaces a required stop/start sequence
with live migration when possible. If there is a live migration
attempted, by definition the resource must move one way or another.
Also, live migration involves three steps, and if one of them fails,
the resource is in an unknown state, so it must be restarted anyway.
-- 
Ken Gaillot 



Re: [ClusterLabs] trigger something at ?

2024-02-01 Thread Ken Gaillot
On Thu, 2024-02-01 at 14:31 +0100, lejeczek via Users wrote:
> 
> On 31/01/2024 18:11, Ken Gaillot wrote:
> > On Wed, 2024-01-31 at 16:37 +0100, lejeczek via Users wrote:
> > > On 31/01/2024 16:06, Jehan-Guillaume de Rorthais wrote:
> > > > On Wed, 31 Jan 2024 16:02:12 +0100
> > > > lejeczek via Users  wrote:
> > > > 
> > > > > On 29/01/2024 17:22, Ken Gaillot wrote:
> > > > > > On Fri, 2024-01-26 at 13:55 +0100, lejeczek via Users
> > > > > > wrote:
> > > > > > > Hi guys.
> > > > > > > 
> > > > > > > Is it possible to trigger some... action - I'm thinking
> > > > > > > specifically
> > > > > > > at shutdown/start.
> > > > > > > If not within the cluster then - if you do that - perhaps
> > > > > > > outside.
> > > > > > > I would like to create/remove constraints, when cluster
> > > > > > > starts &
> > > > > > > stops, respectively.
> > > > > > > 
> > > > > > > many thanks, L.
> > > > > > > 
> > > > > > You could use node status alerts for that, but it's risky
> > > > > > for
> > > > > > alert
> > > > > > agents to change the configuration (since that may result
> > > > > > in
> > > > > > more
> > > > > > alerts and potentially some sort of infinite loop).
> > > > > > 
> > > > > > Pacemaker has no concept of a full cluster start/stop, only
> > > > > > node
> > > > > > start/stop. You could approximate that by checking whether
> > > > > > the
> > > > > > node
> > > > > > receiving the alert is the only active node.
> > > > > > 
> > > > > > Another possibility would be to write a resource agent that
> > > > > > does what
> > > > > > you want and order everything else after it. However it's
> > > > > > even
> > > > > > more
> > > > > > risky for a resource agent to modify the configuration.
> > > > > > 
> > > > > > Finally you could write a systemd unit to do what you want
> > > > > > and
> > > > > > order it
> > > > > > after pacemaker.
> > > > > > 
> > > > > > What's wrong with leaving the constraints permanently
> > > > > > configured?
> > > > > yes, that would be for a node start/stop
> > > > > I struggle with using constraints to move pgsql (PAF) master
> > > > > onto a given node - seems that co/locating paf's master
> > > results in trouble (replication breaks) at/after node
> > > > > shutdown/reboot (not always, but way too often)
> > > > What? What's wrong with colocating PAF's masters exactly? How
> > > > does
> > > > it brake any
> > > > replication? What's these constraints you are dealing with?
> > > > 
> > > > Could you share your configuration?
> > > Constraints beyond/above of what is required by PAF agent
> > > itself, say...
> > > you have multiple pgSQL cluster with PAF - thus multiple
> > > (separate, for each pgSQL cluster) masters and you want to
> > > spread/balance those across HA cluster
> > > (or in other words - avoid having more than 1 pgsql master
> > > per HA node)
> > > These below, I've tried, those move the master onto chosen
> > > node but.. then the issues I mentioned.
> > > 
> > > -> $ pcs constraint location PGSQL-PAF-5438-clone prefers
> > > ubusrv1=1002
> > > or
> > > -> $ pcs constraint colocation set PGSQL-PAF-5435-clone
> > > PGSQL-PAF-5434-clone PGSQL-PAF-5433-clone role=Master
> > > require-all=false setoptions score=-1000
> > > 
> > Anti-colocation sets tend to be tricky currently -- if the first
> > resource can't be assigned to a node, none of them can. We have an
> > idea
> > for a better implementation:
> > 
> >   https://projects.clusterlabs.org/T383
> > 
> > In the meantime, a possible workaround is to use placement-
> > strategy=balanced and define utilization for the clones only. The
> > promoted roles will each get a slight additional utilization, and
> > the
> > cluster should spread the

Re: [ClusterLabs] trigger something at ?

2024-01-31 Thread Ken Gaillot
On Wed, 2024-01-31 at 16:37 +0100, lejeczek via Users wrote:
> 
> On 31/01/2024 16:06, Jehan-Guillaume de Rorthais wrote:
> > On Wed, 31 Jan 2024 16:02:12 +0100
> > lejeczek via Users  wrote:
> > 
> > > On 29/01/2024 17:22, Ken Gaillot wrote:
> > > > On Fri, 2024-01-26 at 13:55 +0100, lejeczek via Users wrote:
> > > > > Hi guys.
> > > > > 
> > > > > Is it possible to trigger some... action - I'm thinking
> > > > > specifically
> > > > > at shutdown/start.
> > > > > If not within the cluster then - if you do that - perhaps
> > > > > outside.
> > > > > I would like to create/remove constraints, when cluster
> > > > > starts &
> > > > > stops, respectively.
> > > > > 
> > > > > many thanks, L.
> > > > > 
> > > > You could use node status alerts for that, but it's risky for
> > > > alert
> > > > agents to change the configuration (since that may result in
> > > > more
> > > > alerts and potentially some sort of infinite loop).
> > > > 
> > > > Pacemaker has no concept of a full cluster start/stop, only
> > > > node
> > > > start/stop. You could approximate that by checking whether the
> > > > node
> > > > receiving the alert is the only active node.
> > > > 
> > > > Another possibility would be to write a resource agent that
> > > > does what
> > > > you want and order everything else after it. However it's even
> > > > more
> > > > risky for a resource agent to modify the configuration.
> > > > 
> > > > Finally you could write a systemd unit to do what you want and
> > > > order it
> > > > after pacemaker.
> > > > 
> > > > What's wrong with leaving the constraints permanently
> > > > configured?
> > > yes, that would be for a node start/stop
> > > I struggle with using constraints to move pgsql (PAF) master
> > > onto a given node - seems that co/locating paf's master
> > > results in trouble (replication breaks) at/after node
> > > shutdown/reboot (not always, but way too often)
> > What? What's wrong with colocating PAF's masters exactly? How does
> > it brake any
> > replication? What's these constraints you are dealing with?
> > 
> > Could you share your configuration?
> Constraints beyond/above of what is required by PAF agent 
> itself, say...
> you have multiple pgSQL cluster with PAF - thus multiple 
> (separate, for each pgSQL cluster) masters and you want to 
> spread/balance those across HA cluster
> (or in other words - avoid having more than 1 pgsql master
> per HA node)
> These below, I've tried, those move the master onto chosen 
> node but.. then the issues I mentioned.
> 
> -> $ pcs constraint location PGSQL-PAF-5438-clone prefers 
> ubusrv1=1002
> or
> -> $ pcs constraint colocation set PGSQL-PAF-5435-clone 
> PGSQL-PAF-5434-clone PGSQL-PAF-5433-clone role=Master 
> require-all=false setoptions score=-1000
> 

Anti-colocation sets tend to be tricky currently -- if the first
resource can't be assigned to a node, none of them can. We have an idea
for a better implementation:

 https://projects.clusterlabs.org/T383

In the meantime, a possible workaround is to use placement-
strategy=balanced and define utilization for the clones only. The
promoted roles will each get a slight additional utilization, and the
cluster should spread them out across nodes whenever possible. I don't
know if that will avoid the replication issues but it may be worth a
try.
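A sketch of that workaround, using the clone names from this thread and a
made-up utilization attribute (the node list and capacities are placeholders):

  pcs property set placement-strategy=balanced
  # nominal "load" per promotable clone
  pcs resource utilization PGSQL-PAF-5433-clone pgsql=1
  pcs resource utilization PGSQL-PAF-5434-clone pgsql=1
  pcs resource utilization PGSQL-PAF-5435-clone pgsql=1
  # node capacities, so "balanced" has something to compare against
  pcs node utilization ubusrv1 pgsql=10
  pcs node utilization ubusrv2 pgsql=10
  pcs node utilization ubusrv3 pgsql=10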
-- 
Ken Gaillot 



Re: [ClusterLabs] controlling cluster behavior on startup

2024-01-30 Thread Ken Gaillot
On Tue, 2024-01-30 at 13:20 +, Walker, Chris wrote:
> >>> However, now it seems to wait that amount of time before it
> elects a
> >>> DC, even when quorum is acquired earlier.  In my log snippet
> below,
> >>> with dc-deadtime 300s,
> >>
> >> The dc-deadtime is not waiting for quorum, but for another DC to
> show
> >> up. If all nodes show up, it can proceed, but otherwise it has to
> wait.
> 
> > I believe all the nodes showed up by 14:17:04, but it still waited
> until 14:19:26 to elect a DC:
> 
> > Jan 29 14:14:25 gopher12 pacemaker-controld  [123697]
> (peer_update_callback)info: Cluster node gopher12 is now member
> (was in unknown state)
> > Jan 29 14:17:04 gopher12 pacemaker-controld  [123697]
> (peer_update_callback)info: Cluster node gopher11 is now member
> (was in unknown state)
> > Jan 29 14:17:04 gopher12 pacemaker-controld  [123697]
> (quorum_notification_cb)  notice: Quorum acquired | membership=54
> members=2
> > Jan 29 14:19:26 gopher12 pacemaker-controld  [123697] (do_log) 
> info: Input I_ELECTION_DC received in state S_ELECTION from
> election_win_cb
> 
> > This is a cluster with 2 nodes, gopher11 and gopher12.
> 
> This is our experience with dc-deadtime too: even if both nodes in
> the cluster show up, dc-deadtime must elapse before the cluster
> starts.  This was discussed on this list a while back (
> https://www.mail-archive.com/users@clusterlabs.org/msg03897.html) and
> an RFE came out of it (
> https://bugs.clusterlabs.org/show_bug.cgi?id=5310). 

Ah, I misremembered, I thought we had done that :(

>  
> I’ve worked around this by having an ExecStartPre directive for
> Corosync that does essentially:
>  
> while ! systemctl -H ${peer} is-active corosync; do sleep 5; done
>  
> With this in place, the nodes wait for each other before starting
> Corosync and Pacemaker.  We can then use the default 20s dc-deadtime
> so that the DC election happens quickly once both nodes are up.

That makes sense
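Packaged as a systemd drop-in, the loop Chris describes might look like this
(peer hostname hard-coded for illustration; -H reaches the peer over SSH, and
the start timeout has to be relaxed so the wait is allowed to run):

  # /etc/systemd/system/corosync.service.d/wait-for-peer.conf
  [Service]
  TimeoutStartSec=infinity
  ExecStartPre=/bin/bash -c 'while ! systemctl -H gopher11 is-active corosync; do sleep 5; done'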

> Thanks,
> Chris
>  
> From: Users  on behalf of Faaland,
> Olaf P. via Users 
> Date: Monday, January 29, 2024 at 7:46 PM
> To: Ken Gaillot , Cluster Labs - All topics
> related to open-source clustering welcomed 
> Cc: Faaland, Olaf P. 
> Subject: Re: [ClusterLabs] controlling cluster behavior on startup
> 
> >> However, now it seems to wait that amount of time before it elects
> a
> >> DC, even when quorum is acquired earlier.  In my log snippet
> below,
> >> with dc-deadtime 300s,
> >
> > The dc-deadtime is not waiting for quorum, but for another DC to
> show
> > up. If all nodes show up, it can proceed, but otherwise it has to
> wait.
> 
> I believe all the nodes showed up by 14:17:04, but it still waited
> until 14:19:26 to elect a DC:
> 
> Jan 29 14:14:25 gopher12 pacemaker-controld  [123697]
> (peer_update_callback)info: Cluster node gopher12 is now member
> (was in unknown state)
> Jan 29 14:17:04 gopher12 pacemaker-controld  [123697]
> (peer_update_callback)info: Cluster node gopher11 is now member
> (was in unknown state)
> Jan 29 14:17:04 gopher12 pacemaker-controld  [123697]
> (quorum_notification_cb)  notice: Quorum acquired | membership=54
> members=2
> Jan 29 14:19:26 gopher12 pacemaker-controld  [123697] (do_log)  info:
> Input I_ELECTION_DC received in state S_ELECTION from election_win_cb
> 
> This is a cluster with 2 nodes, gopher11 and gopher12.
> 
> Am I misreading that?
> 
> thanks,
> Olaf
> 
> 
> From: Ken Gaillot 
> Sent: Monday, January 29, 2024 3:49 PM
> To: Faaland, Olaf P.; Cluster Labs - All topics related to open-
> source clustering welcomed
> Subject: Re: [ClusterLabs] controlling cluster behavior on startup
> 
> On Mon, 2024-01-29 at 22:48 +, Faaland, Olaf P. wrote:
> > Thank you, Ken.
> >
> > I changed my configuration management system to put an initial
> > cib.xml into /var/lib/pacemaker/cib/, which sets all the property
> > values I was setting via pcs commands, including dc-deadtime.  I
> > removed those "pcs property set" commands from the ones that are
> run
> > at startup time.
> >
> > That worked in the sense that after Pacemaker start, the node waits
> > my newly specified dc-deadtime of 300s before giving up on the
> > partner node and fencing it, if the partner never appears as a
> > member.
> >
> > However, now it seems to wait that amount of time before it elects
> a
> > DC, even when quorum is acquired earlier.  In my log snippet below,
> > with dc-deadtime 300s,
> 
> The dc-dead

Re: [ClusterLabs] controlling cluster behavior on startup

2024-01-29 Thread Ken Gaillot
On Mon, 2024-01-29 at 14:35 -0800, Reid Wahl wrote:
> 
> 
> On Monday, January 29, 2024, Ken Gaillot  wrote:
> > On Mon, 2024-01-29 at 18:05 +, Faaland, Olaf P. via Users
> wrote:
> >> Hi,
> >>
> >> I have configured clusters of node pairs, so each cluster has 2
> >> nodes.  The cluster members are statically defined in
> corosync.conf
> >> before corosync or pacemaker is started, and quorum {two_node: 1}
> is
> >> set.
> >>
> >> When both nodes are powered off and I power them on, they do not
> >> start pacemaker at exactly the same time.  The time difference may
> be
> >> a few minutes depending on other factors outside the nodes.
> >>
> >> My goals are (I call the first node to start pacemaker "node1"):
> >> 1) I want to control how long pacemaker on node1 waits before
> fencing
> >> node2 if node2 does not start pacemaker.
> >> 2) If node1 is part-way through that waiting period, and node2
> starts
> >> pacemaker so they detect each other, I would like them to proceed
> >> immediately to probing resource state and starting resources which
> >> are down, not wait until the end of that "grace period".
> >>
> >> It looks from the documentation like dc-deadtime is how #1 is
> >> controlled, and #2 is expected normal behavior.  However, I'm
> seeing
> >> fence actions before dc-deadtime has passed.
> >>
> >> Am I misunderstanding Pacemaker's expected behavior and/or how dc-
> >> deadtime should be used?
> >
> > You have everything right. The problem is that you're starting with
> an
> > empty configuration every time, so the default dc-deadtime is being
> > used for the first election (before you can set the desired value).
> 
> Why would there be fence actions before dc-deadtime expires though?

There isn't -- after the (default) dc-deadtime pops, the node elects
itself DC and runs the scheduler, which considers the other node unseen
and in need of startup fencing. The dc-deadtime has been raised in the
meantime, but that no longer matters.

> 
> >
> > I can't think of anything you can do to get around that, since the
> > controller starts the timer as soon as it starts up. Would it be
> > possible to bake an initial configuration into the PXE image?
> >
> > When the timer value changes, we could stop the existing timer and
> > restart it. There's a risk that some external automation could make
> > repeated changes to the timeout, thus never letting it expire, but
> that
> > seems preferable to your problem. I've created an issue for that:
> >
> >   https://projects.clusterlabs.org/T764
> >
> > BTW there's also election-timeout. I'm not sure offhand how that
> > interacts; it might be necessary to raise that one as well.
> >
> >>
> >> One possibly unusual aspect of this cluster is that these two
> nodes
> >> are stateless - they PXE boot from an image on another server -
> and I
> >> build the cluster configuration at boot time with a series of pcs
> >> commands, because the nodes have no local storage for this
> >> purpose.  The commands are:
> >>
> >> ['pcs', 'cluster', 'start']
> >> ['pcs', 'property', 'set', 'stonith-action=off']
> >> ['pcs', 'property', 'set', 'cluster-recheck-interval=60']
> >> ['pcs', 'property', 'set', 'start-failure-is-fatal=false']
> >> ['pcs', 'property', 'set', 'dc-deadtime=300']
> >> ['pcs', 'stonith', 'create', 'fence_gopher11', 'fence_powerman',
> >> 'ip=192.168.64.65', 'pcmk_host_check=static-list',
> >> 'pcmk_host_list=gopher11,gopher12']
> >> ['pcs', 'stonith', 'create', 'fence_gopher12', 'fence_powerman',
> >> 'ip=192.168.64.65', 'pcmk_host_check=static-list',
> >> 'pcmk_host_list=gopher11,gopher12']
> >> ['pcs', 'resource', 'create', 'gopher11_zpool', 'ocf:llnl:zpool',
> >> 'import_options="-f -N -d /dev/disk/by-vdev"', 'pool=gopher11',
> 'op',
> >> 'start', 'timeout=805']
> >> ...
> >> ['pcs', 'property', 'set', 'no-quorum-policy=ignore']
> >
> > BTW you don't need to change no-quorum-policy when you're using
> > two_node with Corosync.
> >
> >>
> >> I could, instead, generate a CIB so that when Pacemaker is
> started,
> >> it has a full config.  Is that better?
> >>
> >> thanks,
> >> Olaf
> >>
> >> === corosync.conf:
> >> totem {
> >> version: 2
> >> cluster_name: gopher11
> >> secauth: off
>

Re: [ClusterLabs] controlling cluster behavior on startup

2024-01-29 Thread Ken Gaillot
On Mon, 2024-01-29 at 22:48 +, Faaland, Olaf P. wrote:
> Thank you, Ken.
> 
> I changed my configuration management system to put an initial
> cib.xml into /var/lib/pacemaker/cib/, which sets all the property
> values I was setting via pcs commands, including dc-deadtime.  I
> removed those "pcs property set" commands from the ones that are run
> at startup time.
> 
> That worked in the sense that after Pacemaker start, the node waits
> my newly specified dc-deadtime of 300s before giving up on the
> partner node and fencing it, if the partner never appears as a
> member.
> 
> However, now it seems to wait that amount of time before it elects a
> DC, even when quorum is acquired earlier.  In my log snippet below,
> with dc-deadtime 300s,

The dc-deadtime is not waiting for quorum, but for another DC to show
up. If all nodes show up, it can proceed, but otherwise it has to wait.
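As an aside, an initial CIB like the one below can also be prepared offline
with pcs's -f option (the file has to start out as a valid CIB, e.g. one
dumped earlier with "pcs cluster cib"):

  pcs -f /tmp/initial-cib.xml property set dc-deadtime=300s
  pcs -f /tmp/initial-cib.xml property set stonith-action=off
  # then install the result as /var/lib/pacemaker/cib/cib.xml before first start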

> 
> 14:14:24 Pacemaker starts on gopher12
> 14:17:04 quorum is acquired
> 14:19:26 Election Trigger just popped (start time + dc-deadtime
> seconds)
> 14:19:26 gopher12 wins the election
> 
> Is there other configuration that needs to be present in the cib at
> startup time?
> 
> thanks,
> Olaf
> 
> === log extract using new system of installing partial cib.xml before
> startup
> Jan 29 14:14:24 gopher12 pacemakerd  [123690]
> (main)notice: Starting Pacemaker 2.1.7-1.t4 | build=2.1.7
> features:agent-manpages ascii-docs compat-2.0 corosync-ge-2 default-
> concurrent-fencing generated-manpages monotonic nagios ncurses remote
> systemd
> Jan 29 14:14:25 gopher12 pacemaker-attrd [123695]
> (attrd_start_election_if_needed)  info: Starting an election to
> determine the writer
> Jan 29 14:14:25 gopher12 pacemaker-attrd [123695]
> (election_check)  info: election-attrd won by local node
> Jan 29 14:14:25 gopher12 pacemaker-controld  [123697]
> (peer_update_callback)info: Cluster node gopher12 is now member
> (was in unknown state)
> Jan 29 14:17:04 gopher12 pacemaker-controld  [123697]
> (quorum_notification_cb)  notice: Quorum acquired | membership=54
> members=2
> Jan 29 14:19:26 gopher12 pacemaker-controld  [123697]
> (crm_timer_popped)info: Election Trigger just popped |
> input=I_DC_TIMEOUT time=30ms
> Jan 29 14:19:26 gopher12 pacemaker-controld  [123697]
> (do_log)  warning: Input I_DC_TIMEOUT received in state S_PENDING
> from crm_timer_popped
> Jan 29 14:19:26 gopher12 pacemaker-controld  [123697]
> (do_state_transition) info: State transition S_PENDING ->
> S_ELECTION | input=I_DC_TIMEOUT cause=C_TIMER_POPPED
> origin=crm_timer_popped
> Jan 29 14:19:26 gopher12 pacemaker-controld  [123697]
> (election_check)  info: election-DC won by local node
> Jan 29 14:19:26 gopher12 pacemaker-controld  [123697] (do_log)  info:
> Input I_ELECTION_DC received in state S_ELECTION from election_win_cb
> Jan 29 14:19:26 gopher12 pacemaker-controld  [123697]
> (do_state_transition) notice: State transition S_ELECTION ->
> S_INTEGRATION | input=I_ELECTION_DC cause=C_FSA_INTERNAL
> origin=election_win_cb
> Jan 29 14:19:26 gopher12 pacemaker-schedulerd[123696]
> (recurring_op_for_active) info: Start 10s-interval monitor
> for gopher11_zpool on gopher11
> Jan 29 14:19:26 gopher12 pacemaker-schedulerd[123696]
> (recurring_op_for_active) info: Start 10s-interval monitor
> for gopher12_zpool on gopher12
> 
> 
> === initial cib.xml contents
>  num_updates="0" admin_epoch="0" cib-last-written="Mon Jan 29 11:07:06
> 2024" update-origin="gopher12" update-client="root" update-
> user="root" have-quorum="0" dc-uuid="2">
>   
> 
>   
>  name="stonith-action" value="off"/>
> 
>     
>  name="cluster-infrastructure" value="corosync"/>
>  name="cluster-name" value="gopher11"/>
>  name="cluster-recheck-interval" value="60"/>
>  name="start-failure-is-fatal" value="false"/>
> 
>   
> 
> 
>   
>   
> 
> 
> 
>   
> 
> 
> 
> From: Ken Gaillot 
> Sent: Monday, January 29, 2024 10:51 AM
> To: Cluster Labs - All topics related to open-source clustering
> welcomed
> Cc: Faaland, Olaf P.
> Subject: Re: [ClusterLabs] controlling cluster behavior on startup
> 
> On Mon, 2024-01-29 at 18:05 +, Faaland, Olaf P. via Users wrote:
> > Hi,
> > 
> > I have configured clusters of node pairs, so each cluster has 2
> > nod

Re: [ClusterLabs] controlling cluster behavior on startup

2024-01-29 Thread Ken Gaillot
> Jan 25 17:56:00 gopher12 pacemaker-controld  [116040]
> (crm_timer_popped)info: Election Trigger just popped |
> input=I_DC_TIMEOUT time=30ms
> Jan 25 17:56:01 gopher12 pacemaker-based [116035]
> (cib_perform_op)  info: ++
> /cib/configuration/crm_config/cluster_property_set[@id='cib-
> bootstrap-options']:  
> Jan 25 17:56:01 gopher12 pacemaker-controld  [116040]
> (abort_transition_graph)  info: Transition 0 aborted by cib-
> bootstrap-options-no-quorum-policy doing create no-quorum-
> policy=ignore: Configuration change | cib=0.26.0
> source=te_update_diff_v2:464
> path=/cib/configuration/crm_config/cluster_property_set[@id='cib-
> bootstrap-options'] complete=true
> Jan 25 17:56:01 gopher12 pacemaker-controld  [116040]
> (controld_execute_fence_action)   notice: Requesting fencing (off)
> targeting node gopher11 | action=11 timeout=60
> 
> 
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
> 
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] trigger something at ?

2024-01-29 Thread Ken Gaillot
On Fri, 2024-01-26 at 13:55 +0100, lejeczek via Users wrote:
> Hi guys.
> 
> Is it possible to trigger some... action - I'm thinking specifically
> at shutdown/start.
> If not within the cluster then - if you do that - perhaps outside.
> I would like to create/remove constraints, when cluster starts &
> stops, respectively.
> 
> many thanks, L.
> 

You could use node status alerts for that, but it's risky for alert
agents to change the configuration (since that may result in more
alerts and potentially some sort of infinite loop).

Pacemaker has no concept of a full cluster start/stop, only node
start/stop. You could approximate that by checking whether the node
receiving the alert is the only active node.

Another possibility would be to write a resource agent that does what
you want and order everything else after it. However it's even more
risky for a resource agent to modify the configuration.

Finally you could write a systemd unit to do what you want and order it
after pacemaker.
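
A rough sketch of that last option (the unit and script names below are
made up; the scripts would run your pcs/crm commands, and may need to
retry until the node has actually joined the cluster):

  # /etc/systemd/system/cluster-constraints.service  (example name)
  [Unit]
  Description=Adjust Pacemaker constraints around cluster start/stop
  After=pacemaker.service
  Requires=pacemaker.service

  [Service]
  Type=oneshot
  RemainAfterExit=yes
  # ExecStop runs before pacemaker.service stops, so the cluster is
  # still up when the constraints are removed
  ExecStart=/usr/local/sbin/add-constraints.sh
  ExecStop=/usr/local/sbin/remove-constraints.sh

  [Install]
  WantedBy=multi-user.target

  systemctl daemon-reload
  systemctl enable cluster-constraints.service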

What's wrong with leaving the constraints permanently configured?
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Planning for Pacemaker 3

2024-01-25 Thread Ken Gaillot
On Thu, 2024-01-25 at 10:31 +0100, Jehan-Guillaume de Rorthais wrote:
> On Wed, 24 Jan 2024 16:47:54 -0600
> Ken Gaillot  wrote:
> ...
> > > Erm. Well, as this is a major upgrade where we can affect
> > > people's
> > > conf and
> > > break old things & so on, I'll jump in this discussion with a
> > > wishlist to
> > > discuss :)
> > >   
> > 
> > I made sure we're tracking all these (links below),
> 
> Thank you Ken, for creating these tasks. I subscribed to them, but it
> seems I
> can not discuss on them (or maybe I failed to find how to do it).

Hmm, that's bad news. :( I don't immediately see a way to allow
comments without making the issue fully editable. Hopefully we can find
some configuration magic ...

> 
> > but realistically we're going to have our hands full dropping all
> > the
> > deprecated stuff in the time we have.
> 
> Let me know how I can help on these subject. Also, I'm still silently
> sitting on
> IRC chan if needed.
>
> 
> > Most of these can be done in any version.
> 
> Four out of seven can be done in any version. For the three other
> left, in my
> humble opinion and needs from the PAF agent point of view:
> 
> 1. «Support failure handling of notify actions»
>https://projects.clusterlabs.org/T759
> 2. «Change allowed range of scores and value of +/-INFINITY»
>https://projects.clusterlabs.org/T756
> 3. «Default to sending clone notifications when agent supports it»
>https://projects.clusterlabs.org/T758
> 
> The first is the most important as it allows to implement an actual
> election
> before the promotion, breaking the current transition if promotion
> score doesn't
> reflect the reality since last monitor action. Current PAF's code
> makes a lot of
> convolution to have a decent election mechanism preventing the
> promotion of a
> lagging node.
> 
> The second one would help removing some useless complexity from some
> resource
> agent code (at least in PAF).
> 
> The third one is purely for confort and cohesion between actions
> setup.
>
> Have a good day!
> 
> Regards,
> 
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Planning for Pacemaker 3

2024-01-24 Thread Ken Gaillot
On Tue, 2024-01-23 at 18:49 +0100, Jehan-Guillaume de Rorthais wrote:
> Hi there !
> 
> On Wed, 03 Jan 2024 11:06:27 -0600
> Ken Gaillot  wrote:
> 
> > Hi all,
> > 
> > I'd like to release Pacemaker 3.0.0 around the middle of this
> > year. 
> > I'm gathering proposed changes here:
> > 
> >  
> > https://projects.clusterlabs.org/w/projects/pacemaker/pacemaker_3.0_changes/
> > 
> > Please review for anything that might affect you, and reply here if
> > you
> > have any concerns.
> 
> Erm. Well, as this is a major upgrade where we can affect people's
> conf and
> break old things & so on, I'll jump in this discussion with a
> wishlist to
> discuss :)
> 

I made sure we're tracking all these (links below), but realistically
we're going to have our hands full dropping all the deprecated stuff in
the time we have. Most of these can be done in any version.

> 1. "recover", "migration-to" and "migration-from" actions support ?
> 
>   See discussion:
>   
> https://lists.clusterlabs.org/pipermail/developers/2020-February/002258.html

https://projects.clusterlabs.org/T317

https://projects.clusterlabs.org/T755

> 
> 2.1. INT64 promotion scores?

https://projects.clusterlabs.org/T756

> 2.2. discovering promotion score ahead of promotion?

https://projects.clusterlabs.org/T505

> 2.3. make OCF_RESKEY_CRM_meta_notify_* or equivalent officially
> available in all
>  actions 
> 
>   See discussion:
>   
> https://lists.clusterlabs.org/pipermail/developers/2020-February/002255.html
> 

https://projects.clusterlabs.org/T757


> 3.1. deprecate "notify=true" clone option, make it true by default

https://projects.clusterlabs.org/T758

> 3.2. react to notify action return code
> 
>   See discussion:
>   
> https://lists.clusterlabs.org/pipermail/developers/2020-February/002256.html
> 

https://projects.clusterlabs.org/T759

> Off course, I can volunteer to help on some topics.
> 
> Cheers!
> 
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] New ClusterLabs wiki

2024-01-23 Thread Ken Gaillot
Hi all,

The ClusterLabs project manager is now publicly viewable, without
needing a GitHub account:

  https://projects.clusterlabs.org/

Anyone can now follow issues tracked there. (Issues created before the
site was public will still require an account unless someone updates
their settings.)

The site has a simple built-in wiki, so to reduce sysadmin overhead, we
have moved the ClusterLabs wiki there:

  https://projects.clusterlabs.org/w/

The old wiki.clusterlabs.org site is gone, and redirects to the new
one. A lot of the wiki pages were more than a decade old, so they were
dropped if they didn't apply to current software and OSes.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Beginner lost with promotable "group" design

2024-01-17 Thread Ken Gaillot
On Wed, 2024-01-17 at 14:23 +0100, Adam Cécile wrote:
> Hello,
> 
> 
> I'm trying to achieve the following setup with 3 hosts:
> 
> * One master gets a shared IP, then remove default gw, add another
> gw, 
> start a service
> 
> * Two slaves should have none of them but add a different default gw
> 
> I managed quite easily to get the master workflow running with
> ordering 
> constraints but I don't understand how I should move forward with
> the 
> slave configuration.
> 
> I think I must create a promotable resource first then assign my
> other 
> resources with started/stopped  setting depending on the promote
> status 
> of the node. Is that correct ? How to create a promotable
> "placeholder" 
> where I can later attach my existing resources ?

A promotable resource would be appropriate if the service should run on
all nodes, but one node runs with a special setting. That doesn't sound
like what you have.

If you just need the service to run on one node, the shared IP,
service, and both gateways can be regular resources. You just need
colocation constraints between them:

- colocate service and external default route with shared IP
- clone the internal default route and anti-colocate it with shared IP

If you want the service to be able to run even if the IP can't, make
its colocation score finite (or colocate the IP and external route with
the service).

Ordering is separate. You can order the shared IP, service, and
external route however needed. Alternatively, you can put the three of
them in a group (which does both colocation and ordering, in sequence),
and anti-colocate the cloned internal route with the group.
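
A rough pcs sketch of that last variant (all resource names below are
placeholders for your IP, service, and the two routes; exact pcs syntax
can vary slightly between versions):

  # group = colocation + ordering, in sequence: IP, external route, service
  pcs resource group add active-group shared-ip ext-route my-service

  # run the internal route on all nodes...
  pcs resource clone int-route

  # ...except wherever the group (i.e. the shared IP) is running
  pcs constraint colocation add int-route-clone with active-group -INFINITY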

> 
> Sorry for the stupid question but I really don't understand what type
> of 
> elements I should create...
> 
> 
> Thanks in advance,
> 
> Regards, Adam.
> 
> 
> PS: Bonus question should I use "pcs" or "crm" ? It seems both
> command 
> seem to be equivalent and documentations use sometime one or another
> 

They are equivalent -- it's a matter of personal preference (and often
what choices your distro give you).
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Migrating off CentOS

2024-01-15 Thread Ken Gaillot
On Sat, 2024-01-13 at 09:07 -0600, Billy Croan wrote:
> I'm planning to migrate a two-node cluster off CentOS 7 this year.  I
> think I'm taking it to Debian Stable, but open for suggestions if any
> distribution is better supported by pacemaker.


Debian, RHEL, SUSE, Ubuntu, and compatible distros should all have good
support.

Fedora and FreeBSD get regular builds and basic testing but have fewer
users exercising them in production.

FYI, if you want to keep the interfaces you're familiar with, the free
RHEL developer license now allows most personal and small-business
production use: https://access.redhat.com/discussions/5719451

> 
> Have any of you had success doing major upgrades (bullseye to
> bookworm on Debian) of your physical nodes one at a time while each
> node is in standby+maintenance, and rolling the vm from one to the
> other so it doesn't reboot while the hosts are upgraded?  That has
> worked well for me for minor OS updates, but I'm curious about the
> majors.  
> 
> My project this year is even more major, not just upgrading the OS
> but changing distributions.
> 
> I think I have three possible ways I can try this:
> 1) wipe all server disks and start fresh.

A variation, if you can get new hosts, is to set up a test cluster on
new hosts, and once you're comfortable that it will work, stop the old
cluster and turn the new one into production.

> 
> 2) standby and maintenance one node, then reinstall it with a new OS
> and make a New Cluster.  shutdown the vm and copy it, offline, to the
> new one-node cluster. and start it up there. Then once that's
> working, wipe and reinstall the other node, and add it to the new
> cluster.

This should be fine.

> 
> 3) standby and maintenance one node, then Remove it from the
> cluster.  Then reinstall it with the new distribution's OS.  Then re-
> add it to the Existing Cluster.  Move the vm resource to it and
> verify it's working, then do the same with the other physical node,
> and take it out of standby to finish.
> 

This would be fine as long as the corosync and pacemaker versions are
compatible. However as Michele mentioned, RHEL 7 uses Corosync 2, and
the latest of any distro will use Corosync 3, so that will sink this
option.

> (Obviously any of those methods begin with a full backup to offsite
> and local media. and end with a verification against that backup.)
> 
> #1 would be the longest outage but the "cleanest result"
> #3 would be possibly no outage, but I think the least likely to
> work.  I understand EL uses pcs and debian uses crm for example...

Debian offers both IIRC. But that won't affect the upgrade, they both
use Pacemaker command-line tools under the hood. The only difference is
what commands you run to get the same effect.

> #2 is a compromise that should(tm) have only a few seconds of
> outage.  But could blow up i suppose.  They all could blow up though
> so I'm not sure that should play a factor in the decision.
> 
> I can't be the first person to go down this path.  So what do you all
> think?  how have you done it in the past?

-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Planning for Pacemaker 3

2024-01-04 Thread Ken Gaillot
Thanks, I hadn't heard that!

On Thu, 2024-01-04 at 01:13 +0100, Valentin Vidić via Users wrote:
> On Wed, Jan 03, 2024 at 11:06:27AM -0600, Ken Gaillot wrote:
> > I'd like to release Pacemaker 3.0.0 around the middle of this
> > year. 
> > I'm gathering proposed changes here:
> > 
> >  
> > https://projects.clusterlabs.org/w/projects/pacemaker/pacemaker_3.0_changes/
> > 
> > Please review for anything that might affect you, and reply here if
> > you
> > have any concerns.
> 
> Probably best to drop support for rkt bundles as that project has
> ended:
> 
>   https://github.com/rkt/rkt/issues/4024
> 
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Planning for Pacemaker 3

2024-01-03 Thread Ken Gaillot
Hi all,

I'd like to release Pacemaker 3.0.0 around the middle of this year. 
I'm gathering proposed changes here:

 https://projects.clusterlabs.org/w/projects/pacemaker/pacemaker_3.0_changes/

Please review for anything that might affect you, and reply here if you
have any concerns.

Pacemaker major-version releases drop support for deprecated features,
to make the code easier to maintain. The biggest planned changes are
dropping support for Upstart and Nagios resources, as well as rolling
upgrades from Pacemaker 1. Much of the lowest-level public C API will
be dropped.

Because the changes will be backward-incompatible, we will continue to
make 2.1 releases for a few years, with backports of compatible fixes,
to help distribution packagers who need to keep backward compatibility.
-- 
Ken Gaillot 




___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] colocate Redis - weird

2024-01-01 Thread Ken Gaillot
On Wed, 2023-12-20 at 11:16 +0100, lejeczek via Users wrote:
> 
> 
> On 19/12/2023 19:13, lejeczek via Users wrote:
> > hi guys,
> > 
> > Is this below not the weirdest thing?
> > 
> > -> $ pcs constraint ref PGSQL-PAF-5435
> > Resource: PGSQL-PAF-5435
> >   colocation-HA-10-1-1-84-PGSQL-PAF-5435-clone-INFINITY
> >   colocation-REDIS-6385-clone-PGSQL-PAF-5435-clone-INFINITY
> >   order-PGSQL-PAF-5435-clone-HA-10-1-1-84-Mandatory
> >   order-PGSQL-PAF-5435-clone-HA-10-1-1-84-Mandatory-1
> >   colocation_set_PePePe

Can you show the actual constraint information (resources and scores)
for the whole cluster? In particular I'm wondering about that set.

> > 
> > Here Redis master should folow pgSQL master.
> > Which such constraint:
> > 
> > -> $ pcs resource status PGSQL-PAF-5435
> >   * Clone Set: PGSQL-PAF-5435-clone [PGSQL-PAF-5435] (promotable):
> > * Promoted: [ ubusrv1 ]
> > * Unpromoted: [ ubusrv2 ubusrv3 ]
> > -> $ pcs resource status REDIS-6385-clone
> >   * Clone Set: REDIS-6385-clone [REDIS-6385] (promotable):
> > * Unpromoted: [ ubusrv1 ubusrv2 ubusrv3 ]
> > 
> > If I remove that constrain:
> > -> $ pcs constraint delete colocation-REDIS-6385-clone-PGSQL-PAF-
> > 5435-clone-INFINITY
> > -> $ pcs resource status REDIS-6385-clone
> >   * Clone Set: REDIS-6385-clone [REDIS-6385] (promotable):
> > * Promoted: [ ubusrv1 ]
> > * Unpromoted: [ ubusrv2 ubusrv3 ]
> > 
> > and ! I can manually move Redis master around, master moves to each
> > server just fine.
> > I again, add that constraint:
> > 
> > -> $ pcs constraint colocation add master REDIS-6385-clone with
> > master PGSQL-PAF-5435-clone
> > 
> > and the same...
> > 
> > 
>  What there might be about that one node - resource removed, created
> anew and cluster insists on keeping master there.
> I can manually move the master anywhere but if I _clear_ the
> resource, no constraints then cluster move it back to the same node.
> 
> I wonder about:  a) "transient" node attrs & b) if this cluster is
> somewhat broken.
> On a) - can we read more about those somewhere?(not the
> code/internals)
> thanks, L.
> 

Transient attributes are the same as permanent ones except they get
cleared when a node leaves the cluster.

The constraint says that the masters must be located together, but they
each still need to be enabled on a given node with either a master
score attribute (permanent or transient) or a location constraint.
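
If you want to check those scores directly, the promotion score is just
a node attribute. Assuming the agent sets it via crm_master (so the
attribute is named master-REDIS-6385 -- adjust if yours differs), you
can query it with something like:

  # transient promotion score for REDIS-6385 on ubusrv1
  crm_attribute --node ubusrv1 --name master-REDIS-6385 --lifetime reboot --query

  # or list all node attributes, including promotion scores
  crm_mon -1 -A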
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] colocation constraint - do I get it all wrong?

2024-01-01 Thread Ken Gaillot
On Fri, 2023-12-22 at 17:02 +0100, lejeczek via Users wrote:
> hi guys.
> 
> I have a colocation constraint:
> 
> -> $ pcs constraint ref DHCPD
> Resource: DHCPD
>   colocation-DHCPD-GATEWAY-NM-link-INFINITY
> 
> and the trouble is... I thought DHCPD is to follow GATEWAY-NM-link,
> always!
> If that is true that I see very strange behavior, namely.
> When there is an issue with DHCPD resource, cannot be started, then
> GATEWAY-NM-link gets tossed around by the cluster.
> 
> Is that normal & expected - is my understanding of _colocation_
> completely wrong - or my cluster is indeed "broken"?
> many thanks, L.
> 

Pacemaker considers the preferences of colocated resources when
assigning a resource to a node, to ensure that as many resources as
possible can run. So if a colocated resource becomes unable to run on a
node, the primary resource might move to allow the colocated resource
to run.
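
If you would rather the gateway not be dragged around by DHCPD's
failures, one option (a sketch only -- whether it wins out depends on
your other scores and stickiness) is to replace the mandatory
colocation with a finite one:

  pcs constraint delete colocation-DHCPD-GATEWAY-NM-link-INFINITY
  pcs constraint colocation add DHCPD with GATEWAY-NM-link 200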
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Pacemaker 2.1.7 final release now available

2023-12-19 Thread Ken Gaillot
Hi all,

Source code for Pacemaker version 2.1.7 is available at:

https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.1.7

This is primarily a bug fix release. See the ChangeLog or the link
above for details.

Many thanks to all contributors of source code to this release,
including Chris Lumens, Gao,Yan, Grace Chin, Hideo Yamauchi, Jan
Pokorný, Ken Gaillot, liupei, Oyvind Albrigtsen, Reid Wahl, xin liang,
and xuezhixin.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Build cluster one node at a time

2023-12-19 Thread Ken Gaillot
Correct. You want to enable pcsd to start at boot. Also, after starting
pcsd the first time on a node, authorize it from the first node with
"pcs host auth  -u hacluster".

On Tue, 2023-12-19 at 22:42 +0200, Tiaan Wessels wrote:
> So I run the pcs add command for every new node on the first original
> node, not on the node being added? Only corosync, pacemaker and pcsd
> need to run on the node to be added and the commands being run on
> the original node will speak to these on the new node?
> 
> On Tue, 19 Dec 2023, 21:39 Ken Gaillot,  wrote:
> > On Tue, 2023-12-19 at 17:03 +0200, Tiaan Wessels wrote:
> > > Hi,
> > > Is it possible to build a corosync pacemaker cluster on redhat9
> > one
> > > node at a time? In other words, when I'm finished with the first
> > node
> > > and reboot it, all services are started on it. Then i build a
> > second
> > > node to integrate into the cluster and once done, pcs status
> > shows
> > > two nodes on-line ?
> > > Thanks 
> > 
> > Yes, you can use pcs cluster setup with the first node, then pcs
> > cluster node add for each additional node.
> > ___
> > Manage your subscription:
> > https://lists.clusterlabs.org/mailman/listinfo/users
> > 
> > ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Build cluster one node at a time

2023-12-19 Thread Ken Gaillot
On Tue, 2023-12-19 at 17:03 +0200, Tiaan Wessels wrote:
> Hi,
> Is it possible to build a corosync pacemaker cluster on redhat9 one
> node at a time? In other words, when I'm finished with the first node
> and reboot it, all services are started on it. Then i build a second
> node to integrate into the cluster and once done, pcs status shows
> two nodes on-line ?
> Thanks 

Yes, you can use pcs cluster setup with the first node, then pcs
cluster node add for each additional node.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] cluster doesn't do HA as expected, pingd doesn't help

2023-12-18 Thread Ken Gaillot
s lustre-mds1 lustre-mds2 lustre{1..2}; do pcs
> constraint location OST4 avoids $i; done
> # pcs resource create ping ocf:pacemaker:ping dampen=5s
> host_list=192.168.34.250 op monitor interval=3s timeout=7s meta
> target-role="started" globally-unique="false" clone
> # for i in lustre-mgs lustre-mds{1..2} lustre{1..4}; do pcs
> constraint location ping-clone prefers $i; done
> # pcs constraint location OST3 rule score=0 pingd lt 1 or not_defined
> pingd
> # pcs constraint location OST4 rule score=0 pingd lt 1 or not_defined
> pingd
> # pcs constraint location OST3 rule score=125 defined pingd
> # pcs constraint location OST4 rule score=125 defined pingd
> 
> ###  same home base:
> # crm_simulate --simulate --live-check --show-scores
> pcmk__primitive_assign: OST4 allocation score on lustre3: 90
> pcmk__primitive_assign: OST4 allocation score on lustre4: 210
> # pcs status
>   * OST3(ocf::lustre:Lustre):Started lustre3
>   * OST4(ocf::lustre:Lustre):Started lustre4
> 
> ### VM with lustre4 (OST4) is OFF. 
> 
> # crm_simulate --simulate --live-check --show-scores
> pcmk__primitive_assign: OST4 allocation score on lustre3: 90
> pcmk__primitive_assign: OST4 allocation score on lustre4: 100
> Start  OST4( lustre3 )
> Resource action: OST4start on lustre3
> Resource action: OST4monitor=2 on lustre3
> # pcs status
>   * OST3(ocf::lustre:Lustre):Started lustre3
>   * OST4(ocf::lustre:Lustre):Stopped
> 
> Again lustre3 seems unable to overrule due to lower score and pingd
> DOESN'T help at all!
> 
> 
> 4) Can I make a reliable HA failover without pingd to keep things as
> simple as possible?
> 5) Pings might help to affect cluster decisions in case GW is lost,
> but its not working as all the guides say. Why?
> 
> 
> Thanks in advance,
> Artem
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Pacemaker 2.1.7-rc4 now available (likely final for real)

2023-12-12 Thread Ken Gaillot
Hi all,

Source code for the fourth (and very likely final) release candidate
for Pacemaker version 2.1.7 is available at:

https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.1.7-rc4

This release candidate fixes a newly found regression that was
introduced in rc1.

This is probably your last chance to test before the final release,
which I expect will be next Tuesday.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] ocf:pacemaker:ping works strange

2023-12-12 Thread Ken Gaillot
On Tue, 2023-12-12 at 18:08 +0300, Artem wrote:
> Hi Andrei. pingd==0 won't satisfy both statements. It would if I used
> GTE, but I used GT.
> pingd lt 1 --> [0]
> pingd gt 0 --> [1,2,3,...]

It's the "or defined pingd" part of the rule that will match pingd==0.
A value of 0 is defined.

I'm guessing you meant to use "pingd gt 0 *AND* pingd defined", but
then the defined part would become redundant since any value greater
than 0 is inherently defined. So, for that rule, you only need "pingd
gt 0".

> 
> On Tue, 12 Dec 2023 at 17:21, Andrei Borzenkov 
> wrote:
> > On Tue, Dec 12, 2023 at 4:47 PM Artem  wrote:
> > >> > pcs constraint location FAKE3 rule score=0 pingd lt 1 or
> > not_defined pingd
> > >> > pcs constraint location FAKE3 rule score=125 pingd gt 0 or
> > defined pingd
> > > Are they really contradicting?
> > 
> > Yes. pingd == 0 will satisfy both rules. My use of "always" was
> > incorrect, it does not happen for all possible values of pingd, but
> > it
> > does happen for some.
> 
> May be defined/not_defined should be put in front of lt/gt ? It is
> possible that VM goes down, pingd to not_defined, then the rule
> evaluates "lt 1" first, catches an error and doesn't evaluate next
> part (after OR)?

No, the order of and/or clauses doesn't matter.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] ocf:pacemaker:ping works strange

2023-12-12 Thread Ken Gaillot
On Mon, 2023-12-11 at 21:05 +0300, Artem wrote:
> Hi Ken,
> 
> On Mon, 11 Dec 2023 at 19:00, Ken Gaillot 
> wrote:
> > > Question #2) I shut lustre3 VM down and leave it like that
> > How did you shut it down? Outside cluster control, or with
> > something
> > like pcs resource disable?
> > 
> 
> I did it outside of the cluster to simulate a failure. I turned off
> this VM from vCenter. Cluster is unaware of anything behind OS.

In that case check pacemaker.log for messages around the time of the
failure. They should tell you what error originally occurred and why
the cluster is blocked on it.

>  
> > >   * FAKE3   (ocf::pacemaker:Dummy):  Stopped
> > >   * FAKE4   (ocf::pacemaker:Dummy):  Started lustre4
> > >   * Clone Set: ping-clone [ping]:
> > > * Started: [ lustre-mds1 lustre-mds2 lustre-mgs lustre1
> > lustre2
> > > lustre4 ] << lustre3 missing
> > > OK for now
> > > VM boots up. pcs status: 
> > >   * FAKE3   (ocf::pacemaker:Dummy):  FAILED (blocked) [
> > lustre3
> > > lustre4 ]  << what is it?
> > >   * Clone Set: ping-clone [ping]:
> > > * ping  (ocf::pacemaker:ping):   FAILED lustre3
> > (blocked)   
> > > << why not started?
> > > * Started: [ lustre-mds1 lustre-mds2 lustre-mgs lustre1
> > lustre2
> > > lustre4 ]
> > > I checked server processes manually and found that lustre4 runs
> > > "/usr/lib/ocf/resource.d/pacemaker/ping monitor" while lustre3
> > > doesn't
> > > All is according to documentation but results are strange.
> > > Then I tried to add meta target-role="started" to pcs resource
> > create
> > > ping and this time ping started after node rebooted. Can I expect
> > > that it was just missing from official setup documentation, and
> > now
> > > everything will work fine?
> > 
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] resource fails manual failover

2023-12-12 Thread Ken Gaillot
On Tue, 2023-12-12 at 16:50 +0300, Artem wrote:
> Is there a detailed explanation for resource monitor and start
> timeouts and intervals with examples, for dummies?

No, though Pacemaker Explained has some reference information:

https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/singlehtml/index.html#resource-operations

> 
> my resource configured s follows:
> [root@lustre-mds1 ~]# pcs resource show MDT00
> Warning: This command is deprecated and will be removed. Please use
> 'pcs resource config' instead.
> Resource: MDT00 (class=ocf provider=heartbeat type=Filesystem)
>   Attributes: MDT00-instance_attributes
> device=/dev/mapper/mds00
> directory=/lustre/mds00
> force_unmount=safe
> fstype=lustre
>   Operations:
> monitor: MDT00-monitor-interval-20s
>   interval=20s
>   timeout=40s
> start: MDT00-start-interval-0s
>   interval=0s
>   timeout=60s
> stop: MDT00-stop-interval-0s
>   interval=0s
>   timeout=60s
> 
> I issued manual failover with the following commands:
> crm_resource --move -r MDT00 -H lustre-mds1
> 
> resource tried but returned back with the entries in pacemaker.log
> like these:
> Dec 12 15:53:23  Filesystem(MDT00)[1886100]:INFO: Running start
> for /dev/mapper/mds00 on /lustre/mds00
> Dec 12 15:53:45  Filesystem(MDT00)[1886100]:ERROR: Couldn't mount
> device [/dev/mapper/mds00] as /lustre/mds00
> 
> tried again with the same result:
> Dec 12 16:11:04  Filesystem(MDT00)[1891333]:INFO: Running start
> for /dev/mapper/mds00 on /lustre/mds00
> Dec 12 16:11:26  Filesystem(MDT00)[1891333]:ERROR: Couldn't mount
> device [/dev/mapper/mds00] as /lustre/mds00
> 
> Why it cannot move?

The error is outside the cluster software, in the mount attempt itself.
The resource agent logged the ERROR above, so if you can't find more
information in the system logs you may want to look at the agent code
to see what it's doing around that message.
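
For example, on lustre-mds1 you could reproduce the mount by hand (the
device, mount point and fstype below are taken from your configuration)
and then look at what the agent runs around that message:

  # try the same mount outside the cluster to get the raw error
  mount -t lustre /dev/mapper/mds00 /lustre/mds00
  dmesg | tail

  # see what the Filesystem agent does around the "Couldn't mount" message
  grep -n "Couldn't mount" /usr/lib/ocf/resource.d/heartbeat/Filesystem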

> 
> Does this 20 sec interval (between start and error) have anything to
> do with monitor interval settings?

No. The monitor interval says when to schedule another recurring
monitor check after the previous one completes. The first monitor isn't
scheduled until after the start succeeds.

> 
> [root@lustre-mgs ~]# pcs constraint show --full
> Location Constraints:
>   Resource: MDT00
> Enabled on:
>   Node: lustre-mds1 (score:100) (id:location-MDT00-lustre-mds1-
> 100)
>   Node: lustre-mds2 (score:100) (id:location-MDT00-lustre-mds2-
> 100)
> Disabled on:
>   Node: lustre-mgs (score:-INFINITY) (id:location-MDT00-lustre-
> mgs--INFINITY)
>   Node: lustre1 (score:-INFINITY) (id:location-MDT00-lustre1
> --INFINITY)
>   Node: lustre2 (score:-INFINITY) (id:location-MDT00-lustre2
> --INFINITY)
>   Node: lustre3 (score:-INFINITY) (id:location-MDT00-lustre3
> --INFINITY)
>   Node: lustre4 (score:-INFINITY) (id:location-MDT00-lustre4
> --INFINITY)
> Ordering Constraints:
>   start MGT then start MDT00 (kind:Optional) (id:order-MGT-MDT00-
> Optional)
>   start MDT00 then start OST1 (kind:Optional) (id:order-MDT00-OST1-
> Optional)
>   start MDT00 then start OST2 (kind:Optional) (id:order-MDT00-OST2-
> Optional)
> 
> with regards to ordering constraint: OST1 and OST2 are started now,
> while I'm exercising MDT00 failover.
> 
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] ocf:pacemaker:ping works strange

2023-12-11 Thread Ken Gaillot
On Fri, 2023-12-08 at 17:44 +0300, Artem wrote:
> Hello experts.
> 
> I use pacemaker for a Lustre cluster. But for simplicity and
> exploration I use a Dummy resource. I didn't like how resource
> performed failover and failback. When I shut down VM with remote
> agent, pacemaker tries to restart it. According to pcs status it
> marks the resource (not RA) Online for some time while VM stays
> down. 
> 
> OK, I wanted to improve its behavior and set up a ping monitor. I
> tuned the scores like this:
> pcs resource create FAKE3 ocf:pacemaker:Dummy
> pcs resource create FAKE4 ocf:pacemaker:Dummy
> pcs constraint location FAKE3 prefers lustre3=100
> pcs constraint location FAKE3 prefers lustre4=90
> pcs constraint location FAKE4 prefers lustre3=90
> pcs constraint location FAKE4 prefers lustre4=100
> pcs resource defaults update resource-stickiness=110
> pcs resource create ping ocf:pacemaker:ping dampen=5s host_list=local
> op monitor interval=3s timeout=7s clone meta target-role="started"
> for i in lustre{1..4}; do pcs constraint location ping-clone prefers
> $i; done
> pcs constraint location FAKE3 rule score=0 pingd lt 1 or not_defined
> pingd
> pcs constraint location FAKE4 rule score=0 pingd lt 1 or not_defined
> pingd
> pcs constraint location FAKE3 rule score=125 pingd gt 0 or defined
> pingd
> pcs constraint location FAKE4 rule score=125 pingd gt 0 or defined
> pingd

The gt 0 part is redundant since "defined pingd" matches *any* score.

> 
> 
> Question #1) Why I cannot see accumulated score from pingd in
> crm_simulate output? Only location score and stickiness. 
> pcmk__primitive_assign: FAKE3 allocation score on lustre3: 210
> pcmk__primitive_assign: FAKE3 allocation score on lustre4: 90
> pcmk__primitive_assign: FAKE4 allocation score on lustre3: 90
> pcmk__primitive_assign: FAKE4 allocation score on lustre4: 210
> Either when all is OK or when VM is down - score from pingd not added
> to total score of RA

ping scores aren't added to resource scores, they're just set as node
attribute values. Location constraint rules map those values to
resource scores (in this case any defined ping score gets mapped to
125).
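
If you want to see the raw values those rules act on, you can query the
pingd node attribute directly, for example:

  # current pingd value on one node
  attrd_updater --query --name pingd --node lustre3

  # or show all node attributes in the status output
  crm_mon -1 -A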

> 
> 
> Question #2) I shut lustre3 VM down and leave it like that. pcs
> status:

How did you shut it down? Outside cluster control, or with something
like pcs resource disable?

>   * FAKE3   (ocf::pacemaker:Dummy):  Stopped
>   * FAKE4   (ocf::pacemaker:Dummy):  Started lustre4
>   * Clone Set: ping-clone [ping]:
> * Started: [ lustre-mds1 lustre-mds2 lustre-mgs lustre1 lustre2
> lustre4 ] << lustre3 missing
> OK for now
> VM boots up. pcs status: 
>   * FAKE3   (ocf::pacemaker:Dummy):  FAILED (blocked) [ lustre3
> lustre4 ]  << what is it?
>   * Clone Set: ping-clone [ping]:
> * ping  (ocf::pacemaker:ping):   FAILED lustre3 (blocked)   
> << why not started?
> * Started: [ lustre-mds1 lustre-mds2 lustre-mgs lustre1 lustre2
> lustre4 ]
> I checked server processes manually and found that lustre4 runs
> "/usr/lib/ocf/resource.d/pacemaker/ping monitor" while lustre3
> doesn't
> All is according to documentation but results are strange.
> Then I tried to add meta target-role="started" to pcs resource create
> ping and this time ping started after node rebooted. Can I expect
> that it was just missing from official setup documentation, and now
> everything will work fine?
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Pacemaker 2.1.7-rc3 now available (likely final)

2023-12-07 Thread Ken Gaillot
Hi all,

Source code for the third (and likely final) release candidate for
Pacemaker version 2.1.7 is available at:

 
https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.1.7-rc3

This release candidate fixes a couple issues introduced in rc1. See the
ChangeLog or the link above for details.

Everyone is encouraged to download, build, and test the new release. We
do many regression tests and simulations, but we can't cover all
possible use cases, so your feedback is important and appreciated.

This is probably your last chance to test before the final release,
which I expect in about two weeks. If anyone needs more time, let me
know and I can delay it till early January.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Prevent cluster transition when resource unavailable on both nodes

2023-12-06 Thread Ken Gaillot
On Wed, 2023-12-06 at 17:55 +0100, Alexander Eastwood wrote:
> Hello, 
> 
> I administrate a Pacemaker cluster consisting of 2 nodes, which are
> connected to each other via ethernet cable to ensure that they are
> always able to communicate with each other. A network switch is also
> connected to each node via ethernet cable and provides external
> access.
> 
> One of the managed resources of the cluster is a virtual IP, which is
> assigned to a physical network interface card and thus depends on the
> network switch being available. The virtual IP is always hosted on
> the active node.
> 
> We had the situation where the network switch lost power or was
> rebooted, as a result both servers reported `NIC Link is Down`. The
> recover operation on the Virtual IP resource then failed repeatedly
> on the active node, and a transition was initiated. Since the other 

The default reaction to a start failure is to ban the resource from
that node. If it tries to recover repeatedly on the same node, I assume
you set start-failure-is-fatal to false, and/or have a very low
failure-timeout on starts?

> node was also unable to start the resource, the cluster was swaying
> between the 2 nodes until the NIC links were up again.
> 
> Is there a way to change this behaviour? I am thinking of the
> following sequence of events, but have not been able to find a way to
> configure this:
> 
>  1. active node detects NIC Link is Down, which affects a resource
> managed by the cluster (monitor operation on the resource starts to
> fail)
>  2. active node checks if the other (passive) node in the cluster
> would be able to start the resource

There's really no way to check without actually trying to start it, so
basically you're describing what Pacemaker does.

>  3. if passive node can start the resource, transition all resources
> to passive node

I think maybe the "all resources" part is key. Presumably that means
you have a bunch of other resources colocated with and/or ordered after
the IP, so they all have to stop to try to start the IP elsewhere.

If those resources really do require the IP to be active, then that's
the correct behavior. If they don't, then the constraints could be
dropped, reversed, or made optional or asymmetric.

It sounds like you might want an optional colocation, or a colocation
of the IP with the other resources (rather than vice versa).

>  4. if passive node is unable to start the resource, then there is
> nothing to be gained a transition, so no action should be taken

If start-failure-is-fatal is left to true, and no failure-timeout is
configured, then it will try once per node then wait for manual
cleanup. If the colocation is made optional or reversed, the other
resources can continue to run.
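
As a sketch (resource names are placeholders), an optional colocation
that prefers the IP's node but lets the app keep running even if the IP
cannot start anywhere would look something like:

  # 500 is an arbitrary finite score; INFINITY would make it mandatory again
  pcs constraint colocation add my-app with virtual-ip 500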

> 
> Any pointers or advice will be much appreciated!
> 
> Thank you and kind regards,
> 
> Alex Eastwood
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Redundant entries in log

2023-12-05 Thread Ken Gaillot
On Tue, 2023-12-05 at 17:21 +, Jean-Baptiste Skutnik wrote:
> Hi,
> 
> It was indeed a configuration of 1m on the recheck interval that
> triggered the transitions.
> 
> Could you elaborate on why this is not relevant anymore ? I am
> training
> on the HA stack and if there are mechanisms to detect failure more
> advanced than a recheck I would be interested in what to look for in
> the documentation.

Hi,

The recheck interval has nothing to do with detecting resource failures
-- that is done per-resource via the configured monitor operation
interval.

In the past, time-based configuration such as failure timeouts and
date/time-based rules were only guaranteed to be checked as often as
the recheck interval. That was the most common reason why people
lowered it. However, since the 2.0.3 release, these are checked at the
exact appropriate time, so the recheck interval is no longer relevant
for these.

The recheck interval is still useful in two situations: evaluation of
rules using the (cron-like) date_spec element is still only guaranteed
to occur this often; and if there are scheduler bugs resulting in an
incompletely scheduled transition that can be corrected with a new
transition, this will be the maximum time until that happens.
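
So, as a sketch, rather than lowering the recheck interval you would
leave it at its default and put the timing on the resource itself
(my-resource below is just a placeholder):

  # keep the cluster-wide recheck at its default (15 minutes)
  pcs property set cluster-recheck-interval=15min

  # let a particular resource's failures expire on their own schedule
  pcs resource meta my-resource failure-timeout=10min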

> 
> Cheers,
> 
> JB
> 
> > On Nov 29, 2023, at 18:52, Ken Gaillot  wrote:
> > 
> > Hi,
> > 
> > Something is triggering a new transition. The most likely candidate
> > is
> > a low value for cluster-recheck-interval.
> > 
> > Many years ago, a low cluster-recheck-interval was necessary to
> > make
> > certain things like failure-timeout more timely, but that has not
> > been
> > the case in a long time. It should be left to default (15 minutes)
> > in
> > the vast majority of cases. (A new transition will still occur on
> > that
> > schedule, but that's reasonable.)
> > 
> > On Wed, 2023-11-29 at 10:05 +, Jean-Baptiste Skutnik via Users
> > wrote:
> > > Hello all,
> > > 
> > > I am managing a cluster using pacemaker for high availability. I
> > > am
> > > parsing the logs for relevant information on the cluster health
> > > and
> > > the logs are full of the following:
> > > 
> > > ```
> > > Nov 29 09:17:41 esvm2 pacemaker-controld[2893]:  notice: State
> > > transition S_IDLE -> S_POLICY_ENGINE
> > > Nov 29 09:17:41 esvm2 pacemaker-schedulerd[2892]:  notice:
> > > Calculated
> > > transition 8629, saving inputs in /var/lib/pacemaker/pengine/pe-
> > > input-250.bz2
> > > Nov 29 09:17:41 esvm2 pacemaker-controld[2893]:  notice:
> > > Transition
> > > 8629 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0,
> > > Source=/var/lib/pacemaker/pengine/pe-input-250.bz2): Complete
> > > Nov 29 09:17:41 esvm2 pacemaker-controld[2893]:  notice: State
> > > transition S_TRANSITION_ENGINE -> S_IDLE
> > > Nov 29 09:18:41 esvm2 pacemaker-controld[2893]:  notice: State
> > > transition S_IDLE -> S_POLICY_ENGINE
> > > Nov 29 09:18:41 esvm2 pacemaker-schedulerd[2892]:  notice:
> > > Calculated
> > > transition 8630, saving inputs in /var/lib/pacemaker/pengine/pe-
> > > input-250.bz2
> > > Nov 29 09:18:41 esvm2 pacemaker-controld[2893]:  notice:
> > > Transition
> > > 8630 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0,
> > > Source=/var/lib/pacemaker/pengine/pe-input-250.bz2): Complete
> > > Nov 29 09:18:41 esvm2 pacemaker-controld[2893]:  notice: State
> > > transition S_TRANSITION_ENGINE -> S_IDLE
> > > Nov 29 09:19:41 esvm2 pacemaker-controld[2893]:  notice: State
> > > transition S_IDLE -> S_POLICY_ENGINE
> > > Nov 29 09:19:41 esvm2 pacemaker-schedulerd[2892]:  notice:
> > > Calculated
> > > transition 8631, saving inputs in /var/lib/pacemaker/pengine/pe-
> > > input-250.bz2
> > > Nov 29 09:19:41 esvm2 pacemaker-controld[2893]:  notice:
> > > Transition
> > > 8631 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0,
> > > Source=/var/lib/pacemaker/pengine/pe-input-250.bz2): Complete
> > > Nov 29 09:19:41 esvm2 pacemaker-controld[2893]:  notice: State
> > > transition
> > > ...
> > > ```
> > > 
> > > The transition IDs seem to differ however the file containing the
> > > transition data stays the same, implying that the transition does
> > > not
> > > affect the cluster. (/var/lib/pacemaker/pengine/pe-input-250.bz2)
> > > 
> > > I noticed the option to restrict the logging to higher levels
> > > however
> > > some valuable information is logged under the `notice` level and
> > > I
> > > would like to keep it in the logs.
> > > 
> > > Please let me know if I am doing something wrong or if there is a
> > > way
> > > to turn off these messages.
> > > 
> > > Thanks,
> > > 
> > > Jean-Baptiste Skutnik
> > > ___
> > 
> > -- 
> > Ken Gaillot 
> > 
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] RemoteOFFLINE status, permanently

2023-12-04 Thread Ken Gaillot
rom the membership cache
> Nov 29 12:50:11 lustre-mgs.ntslab.ru pacemaker-based [2481]
> (log_info)  info: ++ /cib/configuration/resources:   class="ocf" id="lustre1" provider="pacemaker" type="remote"/>
> Nov 29 12:50:11 lustre-mgs.ntslab.ru pacemaker-based [2481]
> (log_info)  info: ++
>  
> Nov 29 12:50:11 lustre-mgs.ntslab.ru pacemaker-based [2481]
> (log_info)  info: ++ id="lustre1-instance_attributes-server" name="server"
> value="lustre1"/>
> Nov 29 12:50:11 lustre-mgs.ntslab.ru pacemaker-based [2481]
> (log_info)  info: ++ id="lustre1-migrate_from-interval-0s" interval="0s"
> name="migrate_from" timeout="60s"/>
> Nov 29 12:50:11 lustre-mgs.ntslab.ru pacemaker-based [2481]
> (log_info)  info: ++ id="lustre1-migrate_to-interval-0s" interval="0s" name="migrate_to"
> timeout="60s"/>
> Nov 29 12:50:11 lustre-mgs.ntslab.ru pacemaker-based [2481]
> (log_info)  info: ++ id="lustre1-monitor-interval-60s" interval="60s" name="monitor"
> timeout="30s"/>
> Nov 29 12:50:11 lustre-mgs.ntslab.ru pacemaker-based [2481]
> (log_info)  info: ++ id="lustre1-reload-interval-0s" interval="0s" name="reload"
> timeout="60s"/>
> Nov 29 12:50:11 lustre-mgs.ntslab.ru pacemaker-based [2481]
> (log_info)  info: ++ id="lustre1-reload-agent-interval-0s" interval="0s" name="reload-
> agent" timeout="60s"/>
> Nov 29 12:50:11 lustre-mgs.ntslab.ru pacemaker-based [2481]
> (log_info)  info: ++ id="lustre1-start-interval-0s" interval="0s" name="start"
> timeout="60s"/>
> Nov 29 12:50:11 lustre-mgs.ntslab.ru pacemaker-based [2481]
> (log_info)  info: ++ id="lustre1-stop-interval-0s" interval="0s" name="stop"
> timeout="60s"/>
> Nov 29 12:50:11 lustre-mgs.ntslab.ru pacemaker-execd [2483]
> (process_lrmd_get_rsc_info) info: Agent information for 'lustre1'
> not in cache
> Nov 29 12:50:11 lustre-mgs.ntslab.ru pacemaker-controld  [2486]
> (do_lrm_rsc_op) notice: Requesting local execution of probe
> operation for lustre1 on lustre-mgs | transition_key=5:88:7:288b2e10-
> 0bee-498d-b9eb-4bc5f0f8d5bf op_key=lustre1_monitor_0
> Nov 29 12:50:11 lustre-mgs.ntslab.ru pacemaker-controld  [2486]
> (log_executor_event)notice: Result of probe operation for lustre1
> on lustre-mgs: not running (Remote connection inactive) | graph
> action confirmed; call=7 key=lustre1_monitor_0 rc=7
> Nov 29 12:50:11 lustre-mgs.ntslab.ru pacemaker-based [2481]
> (log_info)  info: ++
> /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources:
>   type="remote"/>
> Nov 29 12:50:11 lustre-mgs.ntslab.ru pacemaker-based [2481]
> (log_info)  info: ++
> operation_key="lustre1_monitor_0" operation="monitor" crm-debug-
> origin="controld_update_resource_history" crm_feature_set="3.17.4"
> transition-key="3:88:7:288b2e10-0bee-498d-b9eb-4bc5f0f8d5bf"
> transition-magic="-1:193;3:88:7:288b2e10-0bee-498d-b9eb-4bc5f0f8d5bf" 
> exit-reason="" on_node="lustre-mds1" call-id="-1" rc-code="193" op-st
> Nov 29 12:50:11 lustre-mgs.ntslab.ru pacemaker-based [2481]
> (log_info)  info: +
>  /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resou
> rce[@id='lustre1']/lrm_rsc_op[@id='lustre1_last_0']:  @transition-
> magic=0:7;3:88:7:288b2e10-0bee-498d-b9eb-4bc5f0f8d5bf, @call-id=7,
> @rc-code=7, @op-status=0
> Nov 29 12:50:11 lustre-mgs.ntslab.ru pacemaker-based [2481]
> (log_info)  info: ++
> /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources:
>   type="remote"/>
> Nov 29 12:50:11 lustre-mgs.ntslab.ru pacemaker-based [2481]
> (log_info)  info: ++
> operation_key="lustre1_monitor_0" operation="monitor" crm-debug-
> origin="controld_update_resource_history" crm_feature_set="3.17.4"
> transition-key="5:88:7:28

Re: [ClusterLabs] Redundant entries in log

2023-11-29 Thread Ken Gaillot
Hi,

Something is triggering a new transition. The most likely candidate is
a low value for cluster-recheck-interval.

Many years ago, a low cluster-recheck-interval was necessary to make
certain things like failure-timeout more timely, but that has not been
the case in a long time. It should be left to default (15 minutes) in
the vast majority of cases. (A new transition will still occur on that
schedule, but that's reasonable.)

On Wed, 2023-11-29 at 10:05 +, Jean-Baptiste Skutnik via Users
wrote:
> Hello all,
> 
> I am managing a cluster using pacemaker for high availability. I am
> parsing the logs for relevant information on the cluster health and
> the logs are full of the following:
> 
> ```
> Nov 29 09:17:41 esvm2 pacemaker-controld[2893]:  notice: State
> transition S_IDLE -> S_POLICY_ENGINE
> Nov 29 09:17:41 esvm2 pacemaker-schedulerd[2892]:  notice: Calculated
> transition 8629, saving inputs in /var/lib/pacemaker/pengine/pe-
> input-250.bz2
> Nov 29 09:17:41 esvm2 pacemaker-controld[2893]:  notice: Transition
> 8629 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0,
> Source=/var/lib/pacemaker/pengine/pe-input-250.bz2): Complete
> Nov 29 09:17:41 esvm2 pacemaker-controld[2893]:  notice: State
> transition S_TRANSITION_ENGINE -> S_IDLE
> Nov 29 09:18:41 esvm2 pacemaker-controld[2893]:  notice: State
> transition S_IDLE -> S_POLICY_ENGINE
> Nov 29 09:18:41 esvm2 pacemaker-schedulerd[2892]:  notice: Calculated
> transition 8630, saving inputs in /var/lib/pacemaker/pengine/pe-
> input-250.bz2
> Nov 29 09:18:41 esvm2 pacemaker-controld[2893]:  notice: Transition
> 8630 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0,
> Source=/var/lib/pacemaker/pengine/pe-input-250.bz2): Complete
> Nov 29 09:18:41 esvm2 pacemaker-controld[2893]:  notice: State
> transition S_TRANSITION_ENGINE -> S_IDLE
> Nov 29 09:19:41 esvm2 pacemaker-controld[2893]:  notice: State
> transition S_IDLE -> S_POLICY_ENGINE
> Nov 29 09:19:41 esvm2 pacemaker-schedulerd[2892]:  notice: Calculated
> transition 8631, saving inputs in /var/lib/pacemaker/pengine/pe-
> input-250.bz2
> Nov 29 09:19:41 esvm2 pacemaker-controld[2893]:  notice: Transition
> 8631 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0,
> Source=/var/lib/pacemaker/pengine/pe-input-250.bz2): Complete
> Nov 29 09:19:41 esvm2 pacemaker-controld[2893]:  notice: State
> transition
> ...
> ```
> 
> The transition IDs seem to differ however the file containing the
> transition data stays the same, implying that the transition does not
> affect the cluster. (/var/lib/pacemaker/pengine/pe-input-250.bz2)
> 
> I noticed the option to restrict the logging to higher levels however
> some valuable information is logged under the `notice` level and I
> would like to keep it in the logs.
> 
> Please let me know if I am doing something wrong or if there is a way
> to turn off these messages.
> 
> Thanks,
> 
> Jean-Baptiste Skutnik
> ___

-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Pacemaker 2.1.7-rc1 now available

2023-10-31 Thread Ken Gaillot
Hi all,

Source code for the first release candidate for Pacemaker version 2.1.7
is available at:

https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.1.7-rc1

This is primarily a bug fix release. See the ChangeLog or the link
above for details.

Everyone is encouraged to download, build, and test the new release. We
do many regression tests and simulations, but we can't cover all
possible use cases, so your feedback is important and appreciated.

Many thanks to all contributors of source code to this release,
including Chris Lumens, Gao,Yan, Grace Chin, Hideo Yamauchi, Jan
Pokorný, Ken Gaillot, liupei, Oyvind Albrigtsen, Reid Wahl, and
xuezhixin.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] How to output debug messages in the log file?

2023-10-03 Thread Ken Gaillot
On Tue, 2023-10-03 at 18:19 +0800, Jack via Users wrote:
> I wrote a resource file Stateful1 in /lib/ocf/resources/pacemaker on
> Ubuntu 22.04. It wasn't working, so I wrote  ocf_log debug "hello
> world"  in the file Stateful1. But it didn't output any debug messages.
> How can I output debug messages?
> 

Hi,

Set PCMK_debug=true wherever your distro keeps environment variables
for daemons (/etc/sysconfig/pacemaker, /etc/default/pacemaker, etc.).

Debug messages will show up in the Pacemaker detail log (typically
/var/log/pacemaker/pacemaker.log).
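
A minimal sketch (the sysconfig path depends on your distro, and the
agent must source the OCF shell functions for ocf_log to exist):

  # enable debug logging for the Pacemaker daemons, then restart them
  echo "PCMK_debug=true" >> /etc/default/pacemaker   # or /etc/sysconfig/pacemaker
  systemctl restart pacemaker

  # near the top of the agent, before any ocf_log call:
  : ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/lib/heartbeat}
  . ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs

  ocf_log debug "hello world"

  # then watch the detail log
  grep "hello world" /var/log/pacemaker/pacemaker.log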
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Mutually exclusive resources ?

2023-09-27 Thread Ken Gaillot
On Wed, 2023-09-27 at 16:24 +0200, Adam Cecile wrote:
> On 9/27/23 16:02, Ken Gaillot wrote:
> > On Wed, 2023-09-27 at 15:42 +0300, Andrei Borzenkov wrote:
> > > On Wed, Sep 27, 2023 at 3:21 PM Adam Cecile 
> > > wrote:
> > > > Hello,
> > > > 
> > > > 
> > > > I'm struggling to understand if it's possible to create some
> > > > kind
> > > > of constraint to avoid two different resources to be running on
> > > > the
> > > > same host.
> > > > 
> > > > Basically, I'd like to have floating IP "1" and floating IP "2"
> > > > always being assigned to DIFFERENT nodes.
> > > > 
> > > > Is that something possible ?
> > > 
> > > Sure, negative colocation constraint.
> > > 
> > > > Can you give me a hint ?
> > > > 
> > > 
> > > Using crmsh:
> > > 
> > > colocation IP1-no-with-IP2 -inf: IP1 IP2
> > > 
> > > > Thanks in advance, Adam.
> > 
> > To elaborate, use -INFINITY if you want the IPs to *never* run on
> > the
> > same node, even if there are no other nodes available (meaning one
> > of
> > them has to stop). If you *prefer* that they run on different
> > nodes,
> > but want to allow them to run on the same node in a degraded
> > cluster,
> > use a finite negative score.
> 
> That's exactly what I tried to do:
> crm configure primitive Freeradius systemd:freeradius.service op
> start interval=0 timeout=120 op stop interval=0 timeout=120 op
> monitor interval=60 timeout=100
> crm configure clone Clone-Freeradius Freeradius
> 
> crm configure primitive Shared-IPv4-Cisco-ISE-1 IPaddr2 params
> ip=10.1.1.1 nic=eth0 cidr_netmask=24 meta migration-threshold=2 op
> monitor interval=60 timeout=30 resource-stickiness=50
> crm configure primitive Shared-IPv4-Cisco-ISE-2 IPaddr2 params
> ip=10.1.1.2 nic=eth0 cidr_netmask=24 meta migration-threshold=2 op
> monitor interval=60 timeout=30 resource-stickiness=50
> 
> crm configure location Shared-IPv4-Cisco-ISE-1-Prefer-BRT Shared-
> IPv4-Cisco-ISE-1 50: infra-brt
> crm configure location Shared-IPv4-Cisco-ISE-2-Prefer-BTZ Shared-
> IPv4-Cisco-ISE-2 50: infra-btz
> crm configure colocation Shared-IPv4-Cisco-ISE-Different-Nodes -100:
> Shared-IPv4-Cisco-ISE-1 Shared-IPv4-Cisco-ISE-2
> My hope is that IP1 stays in infra-brt and IP2 goes on infra-btz. I
> want to allow them to keep running on different host so I also added
> stickiness. However, I really do not want them to both run on same
> node so I added a colocation with negative higher score.
> Does it looks good to you ?

Yep, that should work.

The way you have it, if there's some sort of problem and both IPs end
up on the same node, the IP that doesn't prefer that node will move
back to its preferred node once the problem is resolved. That sounds
like what you want, but if you'd rather it not move, you could raise
stickiness above 100.
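
For example, with crmsh you could raise the default stickiness for all
resources (or set resource-stickiness as a proper meta attribute on
just the two IPs) -- the value is only illustrative:

  crm configure rsc_defaults resource-stickiness=200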
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Mutually exclusive resources ?

2023-09-27 Thread Ken Gaillot
On Wed, 2023-09-27 at 15:42 +0300, Andrei Borzenkov wrote:
> On Wed, Sep 27, 2023 at 3:21 PM Adam Cecile 
> wrote:
> > Hello,
> > 
> > 
> > I'm struggling to understand if it's possible to create some kind
> > of constraint to avoid two different resources to be running on the
> > same host.
> > 
> > Basically, I'd like to have floating IP "1" and floating IP "2"
> > always being assigned to DIFFERENT nodes.
> > 
> > Is that something possible ?
> 
> Sure, negative colocation constraint.
> 
> > Can you give me a hint ?
> > 
> 
> Using crmsh:
> 
> colocation IP1-no-with-IP2 -inf: IP1 IP2
> 
> > Thanks in advance, Adam.

To elaborate, use -INFINITY if you want the IPs to *never* run on the
same node, even if there are no other nodes available (meaning one of
them has to stop). If you *prefer* that they run on different nodes,
but want to allow them to run on the same node in a degraded cluster,
use a finite negative score.
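
For example, keeping Andrei's crmsh syntax, a finite version of the
same constraint might look like:

  # strongly prefer different nodes, but allow both IPs on one node if
  # it is the only node left
  colocation IP1-not-with-IP2 -1000: IP1 IP2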
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] pacemaker-remote

2023-09-18 Thread Ken Gaillot
On Thu, 2023-09-14 at 18:28 +0800, Mr.R via Users wrote:
> Hi all,
>
> In Pacemaker-Remote 2.1.6, the pacemaker package is required
> for guest nodes and not for remote nodes. Why is that? What does 
> pacemaker do?
> After adding guest node, pacemaker package does not seem to be 
> needed. Can I not install it here?

I'm not sure what's requiring it in your environment. There's no
dependency in the upstream RPM at least.

The pacemaker package does have the crm_master script needed by some
resource agents, so you will need it if you use any of those. (That
script should have been moved to the pacemaker-cli package in 2.1.3,
oops ...)

> After testing, remote nodes can be offline, but guest nodes cannot
>  be offline. Is there any way to get them offline? Are there
> relevant 
> failure test cases?
> 
> thanks,

To make a guest node offline, stop the resource that creates it.
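
For example, if the guest node is created by a VirtualDomain resource
named guest1-vm (the name is just an example):

  pcs resource disable guest1-vm

Re-enabling the resource brings the guest node back online.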
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Limit the number of resources starting/stoping in parallel possible?

2023-09-18 Thread Ken Gaillot
On Mon, 2023-09-18 at 14:24 +, Knauf Steffen wrote:
> Hi,
> 
> we have multiple clusters (2 nodes + quorum setup) with more than 100
> resources (10 x VIP + 90 microservices) per node.
> If the resources are stopped/started at the same time, the server is
> under heavy load, which may result in timeouts and an unresponsive
> server.
> We configured some ordering constraints (VIP --> microservice). Is
> there a way to limit the number of resources starting/stopping in
> parallel?
> Perhaps you have some other tips to handle such a situation.
> 
> Thanks & greets
> 
> Steffen
> 

Hi,

Yes, see the batch-limit cluster option:

https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/html/options.html#cluster-options
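
For example (the value here is just an illustration):

  pcs property set batch-limit=10

or, with crmsh:

  crm configure property batch-limit=10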

-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] [EXTERNE] Re: Centreon HA Cluster - VIP issue

2023-09-18 Thread Ken Gaillot
On Fri, 2023-09-15 at 09:32 +, Adil BOUAZZAOUI wrote:
> Hi Ken,
> 
> Any update please?
> 
> The idea is clear; I just need to know more information about this 2
> clusters setup:
> 
> 1. Arbitrator:
> 1.1. Only one arbitrator is needed for everything: should I use the
> Quorum provided by Centreon on the official documentation? Or should
> I use the booth ticket manager instead?

I would use booth for distributed data centers. The Centreon setup is
appropriate for a cluster within a single data center or data centers
on the same campus with a low-latency link.

> 1.2. is fencing configured separately? Or is it configured during the
> booth ticket manager installation?

You'll have to configure fencing in each cluster separately.

> 
> 2. Floating IP:
> 2.1. it doesn't hurt if both Floating IPs are running at the same
> time right?

Correct.

> 
> 3. Fail over:
> 3.1. How to update the DNS to point to the appropriate IP?
> 3.2. we're running our own DNS servers; so How to configure booth
> ticket for just the DNS resource?

You can have more than one ticket. On the Pacemaker side, tickets are
tied to resources with rsc_ticket constraints (though you'll probably
be using a higher-level tool that abstracts that).
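
As a rough sketch, a crmsh constraint tying a hypothetical DNS resource
to a ticket would look something like:

  rsc_ticket dns-with-ticketA ticketA: DNSUpdate loss-policy=stop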

How to update the DNS depends on what server you're using -- just
follow its documentation for making changes. You can use the
ocf:pacemaker:Dummy agent as a model and update start to make the DNS
change (in addition to creating the dummy state file). The monitor can
check whether the dummy state file is present and DNS is returning the
desired info. Stop would just remove the dummy state file.
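
A very rough outline of such an agent is below. Everything
site-specific in it (the nsupdate call, file names, the record and
address being checked) is an assumption to be replaced with whatever
your DNS setup actually needs:

#!/bin/sh
# Sketch of a DNS-failover agent modeled on ocf:pacemaker:Dummy.
# All DNS specifics below are placeholders.

: ${OCF_ROOT:=/usr/lib/ocf}
: ${OCF_FUNCTIONS_DIR:=${OCF_ROOT}/lib/heartbeat}
. ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs

STATE="${HA_RSCTMP}/dns-failover.state"

dns_start() {
    # Push the record for this site (placeholder command and files)
    nsupdate -k /etc/dns-failover.key /etc/dns-failover-this-site.txt \
        || return $OCF_ERR_GENERIC
    touch "$STATE"
    return $OCF_SUCCESS
}

dns_stop() {
    # Nothing to undo in DNS; the other site's start will overwrite it
    rm -f "$STATE"
    return $OCF_SUCCESS
}

dns_monitor() {
    [ -f "$STATE" ] || return $OCF_NOT_RUNNING
    # Verify DNS still returns the desired answer (placeholder check)
    dig +short app.example.com | grep -qx '192.0.2.10' \
        && return $OCF_SUCCESS
    return $OCF_ERR_GENERIC
}

case "$1" in
    start)   dns_start ;;
    stop)    dns_stop ;;
    monitor) dns_monitor ;;
    meta-data)
        cat <<'EOF'
<?xml version="1.0"?>
<resource-agent name="dns-failover">
  <version>0.1</version>
  <longdesc lang="en">Sketch: point a DNS record at the active site</longdesc>
  <shortdesc lang="en">DNS failover (sketch)</shortdesc>
  <parameters/>
  <actions>
    <action name="start" timeout="60s"/>
    <action name="stop" timeout="60s"/>
    <action name="monitor" timeout="30s" interval="60s"/>
    <action name="meta-data" timeout="5s"/>
  </actions>
</resource-agent>
EOF
        exit $OCF_SUCCESS ;;
    *)       exit $OCF_ERR_UNIMPLEMENTED ;;
esac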

> 4. MariaDB replication:
> 4.1. How can Centreon MariaDB replicate between the 2 clusters?

Native MySQL replication should work fine for that.

> 5. Centreon:
> 5.1. Will this setup (2 clusters, 2 floating IPs, 1 booth manager)
> work for our Centreon project? 

I don't have any experience with that, but it sounds fine.

> 
> 
> 
> Regards
> Adil Bouazzaoui
> 
> 
> Adil BOUAZZAOUI
> Ingénieur Infrastructures & Technologies
> GSM : +212 703 165 758
> E-mail  : adil.bouazza...@tmandis.ma
> 
> 
> -Message d'origine-
> De : Adil BOUAZZAOUI 
> Envoyé : Friday, September 8, 2023 5:15 PM
> À : Ken Gaillot ; Adil Bouazzaoui <
> adilb...@gmail.com>
> Cc : Cluster Labs - All topics related to open-source clustering
> welcomed 
> Objet : RE: [EXTERNE] Re: [ClusterLabs] Centreon HA Cluster - VIP
> issue
> 
> Hi Ken,
> 
> Thank you for the update and the clarification.
> The idea is clear; I just need to know more information about this 2
> clusters setup:
> 
> 1. Arbitrator:
> 1.1. Only one arbitrator is needed for everything: should I use the
> Quorum provided by Centreon on the official documentation? Or should
> I use the booth ticket manager instead?
> 1.2. is fencing configured separately? Or is it configured during the
> booth ticket manager installation?
> 
> 2. Floating IP:
> 2.1. it doesn't hurt if both Floating IPs are running at the same
> time right?
> 
> 3. Fail over:
> 3.1. How to update the DNS to point to the appropriate IP?
> 3.2. we're running our own DNS servers; so How to configure booth
> ticket for just the DNS resource?
> 
> 4. MariaDB replication:
> 4.1. How can Centreon MariaDB replicate between the 2 clusters?
> 
> 5. Centreon:
> 5.1. Will this setup (2 clusters, 2 floating IPs, 1 booth manager)
> work for our Centreon project? 
> 
> 
> 
> Regards
> Adil Bouazzaoui
> 
> 
> Adil BOUAZZAOUI
> Ingénieur Infrastructures & Technologies GSM : +212 703 165
> 758 E-mail  : adil.bouazza...@tmandis.ma
> 
> 
> -Message d'origine-
> De : Ken Gaillot [mailto:kgail...@redhat.com] Envoyé : Tuesday,
> September 5, 2023 10:00 PM À : Adil Bouazzaoui 
> Cc : Cluster Labs - All topics related to open-source clustering
> welcomed ; Adil BOUAZZAOUI <
> adil.bouazza...@tmandis.ma> Objet : [EXTERNE] Re: [ClusterLabs]
> Centreon HA Cluster - VIP issue
> 
> On Tue, 2023-09-05 at 21:13 +0100, Adil Bouazzaoui wrote:
> > Hi Ken,
> > 
> > thank you a big time for the feedback; much appreciated.
> > 
> > I suppose we go with a new Scenario 3: Setup 2 Clusters across 
> > different DCs connected by booth; so could you please clarify
> > below 
> > points to me so i can understand better and start working on the
> > architecture:
> > 
> > 1- in case of separate clusters connected by booth: should each 
> > cluster have a quorum device for the Master/slave elections?
> 
> Hi,
> 
> Only one arbitrator is needed for everything.
> 
> > Since each cluster in this case has two nodes, Corosync will use the
> > "two_node" configuration to determine quorum.

Re: [ClusterLabs] PostgreSQL HA on EL9

2023-09-18 Thread Ken Gaillot
Ah, good catch. FYI, we created a hook for situations like this a while
back: resource-agents-deps.target. Which reminds me we really need to
document it ...

To use it, put a drop-in unit under /etc/systemd/system/resource-
agents-deps.target.d/ (any name ending in .conf) with:

  [Unit]
  Requires=
  After=

Pacemaker is ordered after resource-agents-deps, so you can use it to
start any non-clustered dependencies.
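
For example, to wait for remote filesystems (the drop-in file name is
arbitrary):

  # /etc/systemd/system/resource-agents-deps.target.d/remote-fs.conf
  [Unit]
  Requires=remote-fs.target
  After=remote-fs.target

followed by "systemctl daemon-reload".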

On Thu, 2023-09-14 at 15:43 +, Larry G. Mills via Users wrote:
> I found my issue with reboots - and it wasn't pacemaker-related at
> all.  My EL9 test system was different from the EL7 system in that it
> hosted the DB on an iSCSI-attached array.  During OS shutdown, the
> array was being unmounted concurrently with pacemaker shutdown, so it
> was not able to cleanly shut down the pgsql resource. I added a
> systemd override to make corosync dependent upon, and require,
> "remote-fs.target".   Everything shuts down cleanly now, as expected.
> 
> Thanks for the suggestions,
> 
> Larry
> 
> > -Original Message-
> > From: Users  On Behalf Of Oyvind
> > Albrigtsen
> > Sent: Thursday, September 14, 2023 5:43 AM
> > To: Cluster Labs - All topics related to open-source clustering
> > welcomed
> > 
> > Subject: Re: [ClusterLabs] PostgreSQL HA on EL9
> > 
> > If you're using network filesystems with the Filesystem agent this
> > patch might solve your issue:
> > https://github.com/ClusterLabs/resource-agents/pull/1869
> > 
> > 
> > Oyvind
> > 
> > On 13/09/23 17:56 +, Larry G. Mills via Users wrote:
> > > > On my RHEL 9 test cluster, both "reboot" and "systemctl reboot"
> > > > wait
> > > > for the cluster to stop everything.
> > > > 
> > > > I think in some environments "reboot" is equivalent to
> > > > "systemctl
> > > > reboot --force" (kill all processes immediately), so maybe see
> > > > if
> > > > "systemctl reboot" is better.
> > > > 
> > > > > On EL7, this scenario caused the cluster to shut itself down
> > > > > on the
> > > > > node before the OS shutdown completed, and the DB resource
> > > > > was
> > > > > stopped/shutdown before the OS stopped.  On EL9, this is not
> > > > > the
> > > > > case, the DB resource is not stopped before the OS shutdown
> > > > > completes.  This leads to errors being thrown when the
> > > > > cluster is
> > > > > started back up on the rebooted node similar to the
> > > > > following:
> > > > > 
> > > 
> > > Ken,
> > > 
> > > Thanks for the reply - and that's interesting that RHEL9 behaves
> > > as expected
> > and AL9 seemingly doesn't.   I did try shutting down via "systemctl
> > reboot",
> > but the cluster and resources were still not stopped cleanly before
> > the OS
> > stopped.  In fact, the commands "shutdown" and "reboot" are just
> > symlinks
> > to systemctl on AL9.2, so that makes sense why the behavior is the
> > same.
> > > Just as a point of reference, my systemd version is:
> > > systemd.x86_64
> > 252-14.el9_2.3
> > > Larry
> > > ___
> > > Manage your subscription:
> > > https://lists.clusterlabs.org/mailman/listinfo/users
> > > ClusterLabs home: 
> > > https://www.clusterlabs.org/
> > 
> > ___
> > Manage your subscription:
> > https://lists.clusterlabs.org/mailman/listinfo/users

Re: [ClusterLabs] PostgreSQL HA on EL9

2023-09-13 Thread Ken Gaillot
On Wed, 2023-09-13 at 16:45 +, Larry G. Mills via Users wrote:
> Hello Pacemaker community,
>  
> I have several two-node postgres 14 clusters that I am migrating from
> EL7 (Scientific Linux 7) to EL9 (AlmaLinux 9.2).
>  
> My configuration:
>  
> Cluster size: two nodes
> Postgres version: 14
> Corosync version: 3.1.7-1.el9  
> Pacemaker version: 2.1.5-9.el9_2
> pcs version: 0.11.4-7.el9_2
>  
> The migration has mostly gone smoothly, but I did notice one non-
> trivial change in recovery behavior between EL7 and EL9.  The
> recovery scenario is:
>  
> With the cluster running normally with one primary DB (i.e. Promoted)
> and one standby (i.e. Unpromoted), reboot one of the cluster nodes
> without first shutting down the cluster on that node.  The reboot is
> a “clean” system shutdown done via either the “reboot” or “shutdown”
> OS commands.

On my RHEL 9 test cluster, both "reboot" and "systemctl reboot" wait
for the cluster to stop everything.

I think in some environments "reboot" is equivalent to "systemctl
reboot --force" (kill all processes immediately), so maybe see if
"systemctl reboot" is better.

>  
> On EL7, this scenario caused the cluster to shut itself down on the
> node before the OS shutdown completed, and the DB resource was
> stopped/shutdown before the OS stopped.  On EL9, this is not the
> case, the DB resource is not stopped before the OS shutdown
> completes.  This leads to errors being thrown when the cluster is
> started back up on the rebooted node similar to the following:
> 
>   * pgsql probe on mynode returned 'error' (Instance "pgsql"
> controldata indicates a running secondary instance, the instance has
> probably crashed)
>  
> While this is not too serious for a standby DB instance, as the
> cluster is able to recover it back to the standby/Unpromoted state,
> if you reboot the Primary/Promoted DB node, the cluster is not able
> to recover it (because that DB still thinks it’s a primary), and the
> node is fenced.
>  
> Is this an intended behavior for the versions of pacemaker/corosync
> that I’m running, or a regression?   It may be possible to put an
> override into the systemd unit file for corosync to force the cluster
> to shutdown before the OS stops, but I’d rather not do that if
> there’s a better way to handle this recovery scenario.
>  
> Thanks for any advice,
>  
> Larry
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] MySQL cluster with auto failover

2023-09-13 Thread Ken Gaillot
On Tue, 2023-09-12 at 10:28 +0200, Damiano Giuliani wrote:
> thanks Ken,
> 
> could you point me in the right direction for a guide or some already
> working configuration?
> 
> Thanks
> 
> Damiano

Nothing specific to galera, just the usual Pacemaker Explained
documentation about clones.

There are some regression tests in the code base that include galera
resources. Some use clones and others bundles (containerized). For
example:

https://github.com/ClusterLabs/pacemaker/blob/main/cts/scheduler/xml/unrunnable-2.xml


> 
> Il giorno lun 11 set 2023 alle ore 16:26 Ken Gaillot <
> kgail...@redhat.com> ha scritto:
> > On Thu, 2023-09-07 at 10:27 +0100, Antony Stone wrote:
> > > On Wednesday 06 September 2023 at 17:01:24, Damiano Giuliani
> > wrote:
> > > 
> > > > Everything is clear now.
> > > > So the point is to use pacemaker and create the floating vip
> > and
> > > > bind it to
> > > > sqlproxy to health check and route the traffic to the available
> > and
> > > > healthy
> > > > galera nodes.
> > > 
> > > Good summary.
> > > 
> > > > It could be useful let pacemaker manage also galera services?
> > > 
> > > No; MySQL / Galera needs to be running on all nodes all the
> > > time.  Pacemaker 
> > > is for managing resources which move between nodes.
> > 
> > It's still helpful to configure galera as a clone in the cluster.
> > That
> > way, Pacemaker can monitor it and restart it on errors, it will
> > respect
> > things like maintenance mode and standby, and it can be used in
> > ordering constraints with other resources, as well as advanced
> > features
> > such as node utilization.
> > 
> > > 
> > > If you want something that ensures processes are running on
> > > machines, 
> > > irrespective of where the floating IP is, look at monit - it's
> > very
> > > simple, 
> > > easy to configure and knows how to manage resources which should
> > run
> > > all the 
> > > time.
> > > 
> > > > Do you have any guide that pack this everything together?
> > > 
> > > No; I've largely made this stuff up myself as I've needed it.
> > > 
> > > 
> > > Antony.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] MySQL cluster with auto failover

2023-09-11 Thread Ken Gaillot
On Thu, 2023-09-07 at 10:27 +0100, Antony Stone wrote:
> On Wednesday 06 September 2023 at 17:01:24, Damiano Giuliani wrote:
> 
> > Everything is clear now.
> > So the point is to use pacemaker and create the floating vip and
> > bind it to
> > sqlproxy to health check and route the traffic to the available and
> > healthy
> > galera nodes.
> 
> Good summary.
> 
> > It could be useful let pacemaker manage also galera services?
> 
> No; MySQL / Galera needs to be running on all nodes all the
> time.  Pacemaker 
> is for managing resources which move between nodes.

It's still helpful to configure galera as a clone in the cluster. That
way, Pacemaker can monitor it and restart it on errors, it will respect
things like maintenance mode and standby, and it can be used in
ordering constraints with other resources, as well as advanced features
such as node utilization.
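
If it helps, a minimal pcs sketch would be something like this (node
names are placeholders; see "pcs resource describe galera" for the
agent's full parameter list):

  pcs resource create galera ocf:heartbeat:galera \
      wsrep_cluster_address="gcomm://db1,db2,db3" \
      promotable promoted-max=3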

> 
> If you want something that ensures processes are running on
> machines, 
> irrespective of where the floating IP is, look at monit - it's very
> simple, 
> easy to configure and knows how to manage resources which should run
> all the 
> time.
> 
> > Do you have any guide that pack this everything together?
> 
> No; I've largely made this stuff up myself as I've needed it.
> 
> 
> Antony.
> 
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Centreon HA Cluster - VIP issue

2023-09-05 Thread Ken Gaillot
On Tue, 2023-09-05 at 21:13 +0100, Adil Bouazzaoui wrote:
> Hi Ken,
> 
> thank you a big time for the feedback; much appreciated.
> 
> I suppose we go with a new Scenario 3: Setup 2 Clusters across
> different DCs connected by booth; so could you please clarify below
> points to me so i can understand better and start working on the
> architecture:
> 
> 1- in case of separate clusters connected by booth: should each
> cluster have a quorum device for the Master/slave elections?

Hi,

Only one arbitrator is needed for everything.

Since each cluster in this case has two nodes, Corosync will use the
"two_node" configuration to determine quorum. When first starting the
cluster, both nodes must come up before quorum is obtained. After then,
only one node is required to keep quorum -- which means that fencing is
essential to prevent split-brain.
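
That corresponds to a corosync.conf quorum section like the following
(two_node also enables wait_for_all by default):

  quorum {
      provider: corosync_votequorum
      two_node: 1
  }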

> 2- separate floating IPs at each cluster: please check the attached
> diagram and let me know if this is exactly what you mean?

Yes, that looks good

> 3- To fail over, you update the DNS to point to the appropriate IP:
> can you suggest any guide to work on so we can have the DNS updated
> automatically?

Unfortunately I don't know of any. If your DNS provider offers an API
of some kind, you can write a resource agent that uses it. If you're
running your own DNS servers, the agent has to update the zone files
appropriately and reload.

Depending on what your services are, it might be sufficient to use a
booth ticket for just the DNS resource, and let everything else stay
running all the time. For example it doesn't hurt anything for both
sites' floating IPs to stay up.

> Regards
> Adil Bouazzaoui
> 
> Le mar. 5 sept. 2023 à 16:48, Ken Gaillot  a
> écrit :
> > Hi,
> > 
> > The scenario you describe is still a challenging one for HA.
> > 
> > A single cluster requires low latency and reliable communication. A
> > cluster within a single data center or spanning data centers on the
> > same campus can be reliable (and appears to be what Centreon has in
> > mind), but it sounds like you're looking for geographical
> > redundancy.
> > 
> > A single cluster isn't appropriate for that. Instead, separate
> > clusters
> > connected by booth would be preferable. Each cluster would have its
> > own
> > nodes and fencing. Booth tickets would control which cluster could
> > run
> > resources.
> > 
> > Whatever design you use, it is pointless to put a quorum tie-
> > breaker at
> > one of the data centers. If that data center becomes unreachable,
> > the
> > other one can't recover resources. The tie-breaker (qdevice for a
> > single cluster or a booth arbitrator for multiple clusters) can be
> > very
> > lightweight, so it can run in a public cloud for example, if a
> > third
> > site is not available.
> > 
> > The IP issue is separate. For that, you will need separate floating
> > IPs
> > at each cluster, on that cluster's network. To fail over, you
> > update
> > the DNS to point to the appropriate IP. That is a tricky problem
> > without a universal automated solution. Some people update the DNS
> > manually after being alerted of a failover. You could write a
> > custom
> > resource agent to update the DNS automatically. Either way you'll
> > need
> > low TTLs on the relevant records.
> > 
> > On Sun, 2023-09-03 at 11:59 +, Adil BOUAZZAOUI wrote:
> > > Hello,
> > >  
> > > My name is Adil, I’m working for Tman company, we are testing the
> > > Centreon HA cluster to monitor our infrastructure for 13
> > companies,
> > > for now we are using the 100 IT license to test the platform,
> > once
> > > everything is working fine then we can purchase a license
> > suitable
> > > for our case.
> > >  
> > > We're stuck at scenario 2: setting up Centreon HA Cluster with
> > Master
> > > & Slave on a different datacenters.
> > > For scenario 1: setting up the Cluster with Master & Slave and
> > VIP
> > > address on the same network (VLAN) it is working fine.
> > >  
> > > Scenario 1: Cluster on Same network (same DC) ==> works fine
> > > Master in DC 1 VLAN 1: 172.30.9.230 /24
> > > Slave in DC 1 VLAN 1: 172.30.9.231 /24
> > > VIP in DC 1 VLAN 1: 172.30.9.240/24
> > > Quorum in DC 1 LAN: 192.168.253.230/24
> > > Poller in DC 1 LAN: 192.168.253.231/24
> > >  
> > > Scenario 2: Cluster on different networks (2 separate DCs
> > connected
> > > with VPN) ==> still not working
> > > Master in DC 1 VLAN 1: 172.30.9

Re: [ClusterLabs] Centreon HA Cluster - VIP issue

2023-09-05 Thread Ken Gaillot
Hi,

The scenario you describe is still a challenging one for HA.

A single cluster requires low latency and reliable communication. A
cluster within a single data center or spanning data centers on the
same campus can be reliable (and appears to be what Centreon has in
mind), but it sounds like you're looking for geographical redundancy.

A single cluster isn't appropriate for that. Instead, separate clusters
connected by booth would be preferable. Each cluster would have its own
nodes and fencing. Booth tickets would control which cluster could run
resources.

Whatever design you use, it is pointless to put a quorum tie-breaker at
one of the data centers. If that data center becomes unreachable, the
other one can't recover resources. The tie-breaker (qdevice for a
single cluster or a booth arbitrator for multiple clusters) can be very
lightweight, so it can run in a public cloud for example, if a third
site is not available.

The IP issue is separate. For that, you will need separate floating IPs
at each cluster, on that cluster's network. To fail over, you update
the DNS to point to the appropriate IP. That is a tricky problem
without a universal automated solution. Some people update the DNS
manually after being alerted of a failover. You could write a custom
resource agent to update the DNS automatically. Either way you'll need
low TTLs on the relevant records.

On Sun, 2023-09-03 at 11:59 +, Adil BOUAZZAOUI wrote:
> Hello,
>  
> My name is Adil, I’m working for Tman company, we are testing the
> Centreon HA cluster to monitor our infrastructure for 13 companies,
> for now we are using the 100 IT license to test the platform, once
> everything is working fine then we can purchase a license suitable
> for our case.
>  
> We're stuck at scenario 2: setting up Centreon HA Cluster with Master
> & Slave on a different datacenters.
> For scenario 1: setting up the Cluster with Master & Slave and VIP
> address on the same network (VLAN) it is working fine.
>  
> Scenario 1: Cluster on Same network (same DC) ==> works fine
> Master in DC 1 VLAN 1: 172.30.9.230 /24
> Slave in DC 1 VLAN 1: 172.30.9.231 /24
> VIP in DC 1 VLAN 1: 172.30.9.240/24
> Quorum in DC 1 LAN: 192.168.253.230/24
> Poller in DC 1 LAN: 192.168.253.231/24
>  
> Scenario 2: Cluster on different networks (2 separate DCs connected
> with VPN) ==> still not working
> Master in DC 1 VLAN 1: 172.30.9.230 /24
> Slave in DC 2 VLAN 2: 172.30.10.230 /24
> VIP: example 102.84.30.XXX. We used a public static IP from our
> internet service provider, we thought that using an IP from a site
> network won't work, if the site goes down then the VIP won't be
> reachable!
> Quorum: 192.168.253.230/24
> Poller: 192.168.253.231/24
>  
>  
> Our goal is to have Master & Slave nodes on different sites, so when
> Site A goes down, we keep monitoring with the slave.
> The problem is that we don't know how to set up the VIP address? Nor
> what kind of VIP address will work? or how can the VIP address work
> in this scenario? or is there anything else that can replace the VIP
> address to make things work.
> Also, can we use a backup poller? so if the poller 1 on Site A goes
> down, then the poller 2 on Site B can take the lead?
>  
> we looked everywhere (The watch, youtube, Reddit, Github...), and we
> still couldn't get a workaround!
>  
> the guide we used to deploy the 2 Nodes Cluster: 
> https://docs.centreon.com/docs/installation/installation-of-centreon-ha/overview/
>  
> attached the 2 DCs architecture example, and also most of the
> required screenshots/config.
>  
>  
> We appreciate your support.
> Thank you in advance.
>  
>  
>  
> Regards
> Adil Bouazzaoui
>  
>Adil BOUAZZAOUI Ingénieur Infrastructures & Technologies   
>  GSM : +212 703 165 758 E-mail  : adil.bouazza...@tmandis.ma 
>  
>  
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] corosync 2.4 and 3.0 in one cluster.

2023-09-05 Thread Ken Gaillot
On Fri, 2023-09-01 at 17:56 +0300, Мельник Антон wrote:
> Hello,
> 
> I have a cluster with two nodes with corosync version 2.4 installed
> there.
> I need to upgrade to corosync version 3.0 without shutting down the
> cluster.

Hi,

It's not possible for Corosync 2 and 3 nodes to form a cluster. They're
"wire-incompatible".

> I thought to do it in this way:
> 1. Stop HA on the first node, upgrade to a newer version of Linux
> along with corosync, and change the corosync config.
> 2. Start upgraded node and migrate resources there.
> 3. Do upgrade on the second node.

You could still do something similar if you use two separate clusters.
You'd remove the first node from the cluster configuration (Corosync
and Pacemaker) before shutting it down, and create a new cluster on it
after upgrading. The new cluster would have itself as the only node,
and all resources would be disabled, but otherwise it would be
identical. Then you could manually migrate resources by disabling them
on the second node and enabling them on the first.

You could even automate it using booth. To migrate the resources, you'd
just have to reassign the ticket.

> Currently on version 2.4 corosync is configured with udpu transport
> and crypto_hash set to sha256.
> As far as I know version 3.0 does not support udpu with configured
> options crypto_hash and crypto_cipher.
> The question is how to allow communication between corosync instances
> with version 2 and 3, if corosync version 2 is configured with
> crypto_hash sha256.
> 
> 
> Thanks,
> Anton.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Coming in Pacemaker 2.1.7: Pacemaker Remote nodes honor PCMK_node_start_state

2023-08-28 Thread Ken Gaillot
Hi all,

The Pacemaker 2.1.7 release, expected in a couple of months, will
primarily be a bug fix release.

One new feature is that the PCMK_node_start_state start-up variable
(set in /etc/sysconfig, /etc/default, etc.) will support Pacemaker
Remote nodes. Previously, it was supported only for full cluster nodes.
It lets you tell the cluster that a new node should start in standby
mode when it is added.
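
For example, on the new node (the file lives under /etc/sysconfig or
/etc/default, depending on the distribution):

  PCMK_node_start_state=standby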
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Asking about Clusterlabs cross DC cluster

2023-08-21 Thread Ken Gaillot
Hi,

Yes, it's fine as long as the latency is reasonably low, and you have
some form of fencing that can work if the link between them is lost. 

sbd is a good choice for fencing if each node has a hardware watchdog.
With two nodes, you really need a third site with either shared storage
(for disk-based sbd) or qdevice (to give true quorum to diskless sbd).

If you have redundant network paths between the data centers, you could
use a traditional fence device instead (such as a smart power strip).
You have to be careful to avoid single points of failure in this design
(for example, if both cables run through the same conduit).

On Thu, 2023-08-17 at 20:55 +0100, Adil Bouazzaoui wrote:
> Hello,
> 
> My name is Adil, I got your email from the ClusterLabs site.
> I'm wondering if I can set up a cluster with Corosync/Pacemaker for a
> cross-DC cluster?
> 
> for example:
> Node 1 (Master) in VLAN 1: 172.30.100.10 /24
> Node 2 (slave) in VLAN 2: 172.30.200.10 /24
> 
> Note: I deployed a Centreon HA cluster with Corosync/Pacemaker on the
> same VLAN and it's working fine.
> My idea is to move the slave node to another site (VLAN 2).
> 
> Thank you in advance
> 
> 
> -- 
> 
> Adil Bouazzaoui
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] DRBD Cluster Problem

2023-08-10 Thread Ken Gaillot
On Thu, 2023-08-10 at 10:00 +, Tiaan Wessels wrote:

 - as a side question, if a situation resolved itself, is there a way
> to have pcs do a resource cleanup by itself ?

See failure-timeout:

https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/html/advanced-options.html#failure-response

The cluster has no way of knowing ahead of time whether the situation
is resolved -- it just cleans up the failure at the failure-timeout and
tries again.
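
For example (the resource name and value are hypothetical):

  pcs resource meta my-db failure-timeout=10min

Ten minutes after the most recent failure, the failure expires and the
cluster can try the resource on that node again.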
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Need a help with "(crm_glib_handler) crit: GLib: g_hash_table_lookup: assertion 'hash_table != NULL' failed"

2023-08-03 Thread Ken Gaillot
In the other case, the problem turned out to be a timing issue that can
occur when the DC and attribute writer are shutting down at the same
time. Since the problem in this case also occurred after shutting down
two nodes together, I'm thinking it's likely the same issue.

A fix should be straightforward. A workaround in the meantime would be
to shut down nodes in sequence rather than in parallel, when shutting
down just some nodes. (Shutting down the entire cluster shouldn't be
subject to the race condition.)

On Wed, 2023-08-02 at 16:53 -0500, Ken Gaillot wrote:
> Ha! I didn't realize crm_report saves blackboxes as text. Always
> something new to learn with Pacemaker :)
> 
> As of 2.1.5, the controller now gets agent metadata asynchronously,
> which fixed bugs with synchronous calls blocking the controller. Once
> the metadata action returns, the original action that required the
> metadata is attempted.
> 
> This led to the odd log messages. Normally, agent actions can't be
> attempted once the shutdown sequence begins. However, in this case,
> metadata actions were initiated before shutdown, but completed after
> shutdown began. The controller thus attempted the original actions
> after it had already disconnected from the executor, resulting in the
> odd logs.
> 
> The fix for that is simple, but addresses only the logs, not the
> original problem that caused the controller to shut down. I'm still
> looking into that.
> 
> I've since heard about a similar case, and I suspect in that case, it
> was related to having a node with an older version trying to join a
> cluster with a newer version.
> 
> On Fri, 2023-07-28 at 15:21 +0300, Novik Arthur wrote:
> >  2023-07-21_pacemaker_debug.log.vm01.bz2
> >  2023-07-21_pacemaker_debug.log.vm02.bz2
> >  2023-07-21_pacemaker_debug.log.vm03.bz2
> >  2023-07-21_pacemaker_debug.log.vm04.bz2
> >  blackbox_txt_vm04.tar.bz2
> > On Thu, Jul 27 12:06:42 EDT 2023, Ken Gaillot kgaillot at
> > redhat.com
> > wrote:
> > 
> > > Running "qb-blackbox /var/lib/pacemaker/blackbox/pacemaker-
> > controld-
> > > 4257.1" (my version can't read it) will show trace logs that
> > > might
> > give
> > > a better idea of what exactly went wrong at this time (though
> > > these
> > > issues are side effects, not the cause).
> > 
> > Blackboxes were attached to crm_report and they are in txt format.
> > Just in case adding them to this email.
> > 
> > > FYI, it's not necessary to set cluster-recheck-interval as low as
> > > 1
> > > minute. A long time ago that could be useful, but modern
> > > Pacemaker
> > > doesn't need it to calculate things such as failure expiration. I
> > > recommend leaving it at default, or at least raising it to 5
> > minutes or
> > > so.
> > 
> > That's good to know, since those rules came from pacemaker-1.x and
> > I'm an adept of the "don't touch if it works" rule
> > 
> > > vm02, vm03, and vm04 all left the cluster at that time, leaving
> > only
> > > vm01. At this point, vm01 should have deleted the transient
> > attributes
> > > for all three nodes. Unfortunately, the logs for that would only
> > > be
> > in
> > > pacemaker.log, which crm_report appears not to have grabbed, so I
> > am
> > > not sure whether it tried.
> > Please find debug logs for "Jul 21" from DC (vm01) and crashed node
> > (vm04) in an attachment.
> > > Thu, Jul 27 12:06:42 EDT 2023, Ken Gaillot kgaillot at redhat.com
> > wrote:
> > > On Wed, 2023-07-26 at 13:29 -0700, Reid Wahl wrote:
> > > > On Fri, Jul 21, 2023 at 9:51 AM Novik Arthur  > gmail.com>
> > > > wrote:
> > > > > Hello Andrew, Ken and the entire community!
> > > > > 
> > > > > I faced a problem and I would like to ask for help.
> > > > > 
> > > > > Preamble:
> > > > > I have dual controller storage (C0, C1) with 2 VM per
> > controller
> > > > > (vm0[1,2] on C0, vm[3,4] on C1).
> > > > > I did online controller upgrade (update the firmware on
> > physical
> > > > > controller) and for that purpose we have a special procedure:
> > > > > 
> > > > > Put all vms on the controller which will be updated into the
> > > > > standby mode (vm0[3,4] in logs).
> > > > > Once all resources are moved to spare controller VMs, turn on
> > > > > maintenance-mode (DC machine is vm01).
> > > > > Shutdown vm0[3,4] and perform

Re: [ClusterLabs] Need a help with "(crm_glib_handler) crit: GLib: g_hash_table_lookup: assertion 'hash_table != NULL' failed"

2023-08-02 Thread Ken Gaillot
Ha! I didn't realize crm_report saves blackboxes as text. Always
something new to learn with Pacemaker :)

As of 2.1.5, the controller now gets agent metadata asynchronously,
which fixed bugs with synchronous calls blocking the controller. Once
the metadata action returns, the original action that required the
metadata is attempted.

This led to the odd log messages. Normally, agent actions can't be
attempted once the shutdown sequence begins. However, in this case,
metadata actions were initiated before shutdown, but completed after
shutdown began. The controller thus attempted the original actions
after it had already disconnected from the executor, resulting in the
odd logs.

The fix for that is simple, but addresses only the logs, not the
original problem that caused the controller to shut down. I'm still
looking into that.

I've since heard about a similar case, and I suspect in that case, it
was related to having a node with an older version trying to join a
cluster with a newer version.

On Fri, 2023-07-28 at 15:21 +0300, Novik Arthur wrote:
>  2023-07-21_pacemaker_debug.log.vm01.bz2
>  2023-07-21_pacemaker_debug.log.vm02.bz2
>  2023-07-21_pacemaker_debug.log.vm03.bz2
>  2023-07-21_pacemaker_debug.log.vm04.bz2
>  blackbox_txt_vm04.tar.bz2
> On Thu, Jul 27 12:06:42 EDT 2023, Ken Gaillot kgaillot at redhat.com
> wrote:
> 
> > Running "qb-blackbox /var/lib/pacemaker/blackbox/pacemaker-
> controld-
> > 4257.1" (my version can't read it) will show trace logs that might
> give
> > a better idea of what exactly went wrong at this time (though these
> > issues are side effects, not the cause).
> 
> Blackboxes were attached to crm_report and they are in txt format.
> Just in case adding them to this email.
> 
> > FYI, it's not necessary to set cluster-recheck-interval as low as 1
> > minute. A long time ago that could be useful, but modern Pacemaker
> > doesn't need it to calculate things such as failure expiration. I
> > recommend leaving it at default, or at least raising it to 5
> minutes or
> > so.
> 
> That's good to know, since those rules came from pacemaker-1.x and
> I'm an adept of the "don't touch if it works" rule
> 
> > vm02, vm03, and vm04 all left the cluster at that time, leaving
> only
> > vm01. At this point, vm01 should have deleted the transient
> attributes
> > for all three nodes. Unfortunately, the logs for that would only be
> in
> > pacemaker.log, which crm_report appears not to have grabbed, so I
> am
> > not sure whether it tried.
> Please find debug logs for "Jul 21" from DC (vm01) and crashed node
> (vm04) in an attachment.
> > Thu, Jul 27 12:06:42 EDT 2023, Ken Gaillot kgaillot at redhat.com
> wrote:
> > On Wed, 2023-07-26 at 13:29 -0700, Reid Wahl wrote:
> > > On Fri, Jul 21, 2023 at 9:51 AM Novik Arthur  gmail.com>
> > > wrote:
> > > > Hello Andrew, Ken and the entire community!
> > > > 
> > > > I faced a problem and I would like to ask for help.
> > > > 
> > > > Preamble:
> > > > I have dual controller storage (C0, C1) with 2 VM per
> controller
> > > > (vm0[1,2] on C0, vm[3,4] on C1).
> > > > I did online controller upgrade (update the firmware on
> physical
> > > > controller) and for that purpose we have a special procedure:
> > > > 
> > > > Put all vms on the controller which will be updated into the
> > > > standby mode (vm0[3,4] in logs).
> > > > Once all resources are moved to spare controller VMs, turn on
> > > > maintenance-mode (DC machine is vm01).
> > > > Shutdown vm0[3,4] and perform firmware update on C1 (OS + KVM +
> > > > HCA/HBA + BMC drivers will be updated).
> > > > Reboot C1
> > > > Start vm0[3,4]
> > > > On this step I hit the problem.
> > > > Do the same steps for C0 (turn off maint, put nodes 3,4 to
> online,
> > > > put 1-2 to standby, maint and etc).
> > > > 
> > > > Here is what I observed during step 5.
> > > > Machine vm03 started without problems, but vm04 caught critical
> > > > error and HA stack died. If manually start the pacemaker one
> more
> > > > time then it starts without problems and vm04 joins the
> cluster.
> > > > 
> > > > Some logs from vm04:
> > > > 
> > > > Jul 21 04:05:39 vm04 corosync[3061]:  [QUORUM] This node is
> within
> > > > the primary component and will provide service.
> > > > Jul 21 04:05:39 vm04 corosync[3061]:  [QUORUM] Members[4]: 1 2
> 3 4
> > > > Jul 21 04:05:39 vm04 corosync[306

Re: [ClusterLabs] pcs node removal still crm_node it is removed node is listing as lost node

2023-07-27 Thread Ken Gaillot
On Thu, 2023-07-13 at 11:03 -0500, Ken Gaillot wrote:
> On Thu, 2023-07-13 at 09:58 +, S Sathish S via Users wrote:
> > Hi Team,
> >  
> > Problem Statement : we are trying to remove node on pcs cluster,
> > post
> > execution also still crm_node 
> > it is removed node is listing as lost node.
> >  
> > we have checked corosync.conf file it is removed but still it is
> > displaying on
> > crm_node -l.
> >  
> > [root@node1 ~]# pcs cluster node remove node2 --force
> > Destroying cluster on hosts: 'node2'...
> > node2: Successfully destroyed cluster
> > Sending updated corosync.conf to nodes...
> > node1: Succeeded
> > node1: Corosync configuration reloaded
> >  
> > [root@node1 ~]# crm_node -l
> > 1 node1 member
> > 2 node2 lost
> 
> This looks like a possible regression. The "node remove" command
> should
> erase all knowledge of the node, but I can reproduce this, and I
> don't
> see log messages I would expect. I'll have to investigate further.

This turned out to be a regression introduced in Pacemaker 2.0.5. It is
now fixed in the main branch by commit 3e31da00, expected to land in
2.1.7 toward the end of this year.

It only affected "crm_node -l".

> 
> >  
> > In RHEL 7.x we are using below rpm version not seeing this issue
> > while removing the node.
> > pacemaker-2.0.2-2.el7
> > corosync-2.4.4-2.el7
> > pcs-0.9.170-1.el7
> >  
> > In RHEL 8.x we are using below rpm version but seeing above issue
> > over here.
> > pacemaker-2.1.6-1.el8
> > corosync-3.1.7-1.el8
> > pcs-0.10.16-1.el8
> >  
> > Thanks and Regards,
> > S Sathish S
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Need a help with "(crm_glib_handler) crit: GLib: g_hash_table_lookup: assertion 'hash_table != NULL' failed"

2023-07-27 Thread Ken Gaillot
 of lrm[@id='4']: Resource state removal
> > Jul 21 04:05:47 vm01 pacemaker-schedulerd[4028]: notice: No fencing
> > will be done until there are resources to manage
> > Jul 21 04:05:47 vm01 pacemaker-schedulerd[4028]: notice:  *
> > Shutdown vm04

This is where things start to go wrong, and it has nothing to do with
any of the messages here. It means that the shutdown node attribute was
not erased when vm04 shut down the last time before this. Going back,
we see when that happened:

Jul 21 03:49:06 vm01 pacemaker-attrd[4017]: notice: Setting shutdown[vm04]: 
(unset) -> 1689911346

vm02, vm03, and vm04 all left the cluster at that time, leaving only
vm01. At this point, vm01 should have deleted the transient attributes
for all three nodes. Unfortunately, the logs for that would only be in
pacemaker.log, which crm_report appears not to have grabbed, so I am
not sure whether it tried.

> > Jul 21 04:05:47 vm01 pacemaker-schedulerd[4028]: notice: Calculated
> > transition 17, saving inputs in /var/lib/pacemaker/pengine/pe-
> > input-940.bz2

What's interesting in this transition is that we schedule probes on
vm04 even though we're shutting it down. That's a bug, and leads to the
"No executor connection" messages we see on vm04. I've added a task to
our project manager to take care of that. That's all a side effect
though and not causing any real problems.

> > As far as I understand, vm04 was killed by DC during the election
> > of a new attr writer?
> 
> Not sure yet, maybe someone else recognizes this.
> 
> I see the transition was aborted due to peer halt right after node
> vm04 joined. A new election started due to detection of node vm04 as
> attribute writer. Node vm04's resource state was removed, which is a
> normal part of the join sequence; this caused another transition
> abort
> message for the same transition number.
> 
> Jul 21 04:05:39 vm01 pacemaker-controld[4048]: notice: Node vm04
> state
> is now member
> ...
> Jul 21 04:05:39 vm01 corosync[3134]:  [KNET  ] pmtud: Global data MTU
> changed to: 1397
> Jul 21 04:05:39 vm01 pacemaker-controld[4048]: notice: Transition 16
> aborted: Peer Halt
> Jul 21 04:05:39 vm01 pacemaker-attrd[4017]: notice: Detected another
> attribute writer (vm04), starting new election
> Jul 21 04:05:39 vm01 pacemaker-attrd[4017]: notice: Setting
> #attrd-protocol[vm04]: (unset) -> 5
> ...
> Jul 21 04:05:41 vm01 pacemaker-controld[4048]: notice: Transition 16
> aborted by deletion of lrm[@id='4']: Resource state removal
> 
> Looking at pe-input-939 and pe-input-940, node vm04 was marked as
> shut down:
> 
> Jul 21 04:05:38 vm01 pacemaker-schedulerd[4028]: notice: Calculated
> transition 16, saving inputs in
> /var/lib/pacemaker/pengine/pe-input-939.bz2
> Jul 21 04:05:44 vm01 pacemaker-controld[4048]: notice: Transition 16
> (Complete=24, Pending=0, Fired=0, Skipped=34, Incomplete=34,
> Source=/var/lib/pacemaker/pengine/pe-input-939.bz2): Stopped
> Jul 21 04:05:47 vm01 pacemaker-schedulerd[4028]: notice:  * Shutdown
> vm04
> Jul 21 04:05:47 vm01 pacemaker-schedulerd[4028]: notice: Calculated
> transition 17, saving inputs in
> /var/lib/pacemaker/pengine/pe-input-940.bz2
> 
> 939:
> <node_state ... crm-debug-origin="do_state_transition" join="down"
> expected="down">
>   <transient_attributes ...>
>     <instance_attributes ...>
>       <nvpair ... value="3.16.2"/>
>       <nvpair ... value="1689911346"/>
>     </instance_attributes>
>   </transient_attributes>
> </node_state>
> 
> 940:
> <node_state ... crm-debug-origin="do_state_transition" join="member"
> expected="member">
>   <transient_attributes ...>
>     <instance_attributes ...>
>       <nvpair ... value="3.16.2"/>
>       <nvpair ... value="1689911346"/>
>     </instance_attributes>
>   </transient_attributes>
> </node_state>
> 
> I suppose that node vm04's state was not updated before the
> transition
> was aborted. So when the new transition (940) ran, the scheduler saw
> that node vm04 is expected to be in shutdown state, and it triggered
> a
> shutdown.
> 
> This behavior might already be fixed upstream by the following
> commit:
> https://github.com/ClusterLabs/pacemaker/commit/5e3b3d14
> 
> That commit introduced a regression, however, and I'm working on
> fixing it.

I suspect that's unrelated, because transient attributes are cleared
when a node leaves rather than when it joins.

> 
> 
> > The issue is reproducible from time to time and the version of
> > pacemaker is " 2.1.5-8.1.el8_8-a3f44794f94" from Rocky linux 8.8.
> > 
> > I attached crm_report with blackbox. I have debug logs, but they
> > are pretty heavy (~40MB bzip --best). Please tell me if you need
> > them.
> > 
> > Thanks,
> > Arthur
> > 
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Pacemaker fatal shutdown

2023-07-25 Thread Ken Gaillot
On Thu, 2023-07-20 at 12:43 +0530, Priyanka Balotra wrote:
> What I mainly want to understand is that:
> - why "fatal failure" is coming 

The logs so far don't show that. The earliest sign is:

Jul 17 14:18:20.085 FILE-6 pacemaker-fenced[19411]
(remote_op_done)   notice: Operation 'reboot' targeting FILE-2 by FILE-
4 for pacemaker-controld.19415@FILE-6: OK | id=4e523b34

You'd want to figure out which node was the Designated Controller (DC)
at that time, and look at its logs before this time. The DC will have
"Calculated transition" log messages.

You want to find such messages just before the timestamp above. If you
look above the "Calculated transition" message, it will show what
actions the cluster wants to take, including fencing. The logs around
there should say why the fencing was needed.

> - why does pacemaker not start on the node after a node boots
> followed by  "pacemaker fatal failure" .

A fatal failure is one where Pacemaker should stay down, so that's what
it does. In this case, fencing completed against the node, but the node
was still alive, so it shuts down and waits for manual intervention to
figure out what happened.

> - How can this be handled?

In a situation like this, figure out (1) why fencing was needed and (2)
why successful fencing did not kill the node (if you're using fabric
fencing such as SCSI fencing, that could be a reason, otherwise it
might be a misconfiguration).

Once you know that, it should be fairly obvious what to do about it,
and once it's taken care of, you can manually start Pacemaker on the
node again.

> 
> Thanks
> Priyanka
> 
> On Thu, Jul 20, 2023 at 12:41 PM Priyanka Balotra <
> priyanka.14balo...@gmail.com> wrote:
> > Hi, 
> > 
> > Here are FILE-6 logs: 
> > 
> > 65710:Jul 17 14:16:51.517 FILE-6 pacemaker-controld  [19415]
> > (throttle_mode)debug: Current load is 0.76 across 10
> > core(s)
> > 65711:Jul 17 14:16:55.085 FILE-6 pacemaker-controld  [19415]
> > (throttle_update)  debug: Node FILE-2 has negligible load and
> > supports at most 20 jobs; new job limit 20
> > 65712:Jul 17 14:16:55.085 FILE-6 pacemaker-controld  [19415]
> > (handle_request)   debug: The throttle changed. Trigger a graph.
> > 65713:Jul 17 14:16:55.085 FILE-6 pacemaker-controld  [19415]
> > (pcmk__set_flags_as)   debug: FSA action flags 0x0002
> > (new_actions) for controller set by s_crmd_fsa:198
> > 65714:Jul 17 14:16:55.085 FILE-6 pacemaker-controld  [19415]
> > (s_crmd_fsa)   debug: Processing I_JOIN_REQUEST: [
> > state=S_INTEGRATION cause=C_HA_MESSAGE origin=route_message ]
> > 65715:Jul 17 14:16:55.085 FILE-6 pacemaker-controld  [19415]
> > (pcmk__clear_flags_as) debug: FSA action flags 0x0002
> > (an_action) for controller cleared by do_fsa_action:108
> > 65716:Jul 17 14:16:55.085 FILE-6 pacemaker-controld  [19415]
> > (do_dc_join_filter_offer)  debug: Accepting join-1 request from
> > FILE-2 | ref=join_request-crmd-1689603392-8
> > 65717:Jul 17 14:16:55.085 FILE-6 pacemaker-controld  [19415]
> > (pcmk__update_peer_expected)   info: do_dc_join_filter_offer:
> > Node FILE-2[2] - expected state is now member (was (null))
> > 65718:Jul 17 14:16:55.085 FILE-6 pacemaker-controld  [19415]
> > (do_dc_join_filter_offer)  debug: 2 nodes currently integrated in
> > join-1
> > 65719:Jul 17 14:16:55.085 FILE-6 pacemaker-controld  [19415]
> > (check_join_state) debug: join-1: Integration of 2 peers
> > complete | state=S_INTEGRATION for=do_dc_join_filter_offer
> > 65720:Jul 17 14:16:55.085 FILE-6 pacemaker-controld  [19415]
> > (pcmk__set_flags_as)   debug: FSA action flags 0x0004
> > (new_actions) for controller set by s_crmd_fsa:198
> > 65721:Jul 17 14:16:55.085 FILE-6 pacemaker-controld  [19415]
> > (s_crmd_fsa)   debug: Processing I_INTEGRATED: [
> > state=S_INTEGRATION cause=C_FSA_INTERNAL origin=check_join_state ]
> > 65722:Jul 17 14:16:55.085 FILE-6 pacemaker-controld  [19415]
> > (do_state_transition)  info: State transition S_INTEGRATION ->
> > S_FINALIZE_JOIN | input=I_INTEGRATED cause=C_FSA_INTERNAL
> > origin=check_join_state
> > 65723:Jul 17 14:16:55.085 FILE-6 pacemaker-controld  [19415]
> > (pcmk__set_flags_as)   debug: FSA action flags 0x0020
> > (A_INTEGRATE_TIMER_STOP) for controller set by
> > do_state_transition:559
> > 65724:Jul 17 14:16:55.085 FILE-6 pacemaker-controld  [19415]
> > (pcmk__set_flags_as)   debug: FSA action flags 0x0040
> > (A_FINALIZE_TIMER_START) for controller set by
> > do_state_transition:563
> > 65725:Jul 17 14:16:55.085 FILE-6 pacemaker-controld  [19415]
> > (pcmk__set_flags_as)   debug: FSA action flags 0x0200
> > (A_DC_TIMER_STOP) for controller set by do_state_transition:569
> > 65726:Jul 17 14:16:55.085 FILE-6 pacemaker-controld  [19415]
> > (do_state_transition)  debug: All cluster nodes (2) responded
> > to join offer
> > 65727:Jul 17 14:16:55.085 FILE-6 pacemaker-controld  [19415]
> > (pcmk__clear_flags_as) debug: FSA action flags 0x0200

Re: [ClusterLabs] Fencing issue with dlm resources in pacemaker cluster.

2023-07-25 Thread Ken Gaillot
On Sat, 2023-07-22 at 08:33 +, Sai Siddhartha Peesapati wrote:
> This is a 3-node cluster with gfs2 filesystem resources configured
> using dlm and clvmd. Stonith is enabled. dlm and gfs2 resources are
> set to fence on failures.
> Pacemaker version on the cluster - 2.1.4-5.el8 (CentOS 8 Stream)
> 
> Pacemaker default fence actions are set to power off the node instead
> of a reboot by setting "pcmk_reboot_action=off" on IPMI fence
> devices. Also, cluster wide stonith default action is set to poweroff
> the node instead of a reboot
> # pcs property config |grep stonith
>  stonith-action: off
>  stonith-enabled: true
> 
> But still the node is rebooting as dlm_controld resource is sending a
> reboot action as part of fencing. Below are the logs obtained from
> syslog to confirm the same.

Hi,

Unfortunately, only fencing initiated by Pacemaker itself uses stonith-
action, pcmk_reboot_action, and similar options. External software
packages such as DLM specify the action they want when requesting
fencing. You can check the manual for the relevant software to see if
they provide configuration for the action to use.

It would be possible to create a new Pacemaker interface that allows
external software to say "use the default action" but it would require
coordination with the developers of the external software to test for
support and use it.

A workaround would be to make a copy of the fence agent you are using,
and map "reboot" to "off" before doing anything else.

> 
> dlm_controld[73542]: 813117 fence request 3 pid 1695739 nodedown time
> 1690006193 fence_all dlm_stonith
> pacemaker-fenced[15706]: notice: Client stonith-api.1695739 wants to
> fence (reboot) 3 using any device
> 
> I need the node to be powered off from fencing operations rather than
> a reboot. Disabling fencing on dlm resources is not an option. Is
> there any other way to solve this and make dlm issue a poweroff
> action instead of a reboot as part of fencing. 
> 
> Regards
> P V K Sai Siddhartha
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Pacemaker fatal shutdown

2023-07-19 Thread Ken Gaillot
On Wed, 2023-07-19 at 23:49 +0530, Priyanka Balotra wrote:
> Hi All, 
> I am using SLES 15 SP4. One of the nodes of the cluster is brought
> down and boot up after sometime. Pacemaker service came up first but
> later it faced a fatal shutdown. Due to that crm service is down. 
> 
> The logs from /var/log/pacemaker.pacemaker.log are as follows:
> 
> Jul 17 14:18:20.093 FILE-2 pacemakerd  [15956]
> (pcmk_child_exit)warning: Shutting cluster down because
> pacemaker-controld[15962] had fatal failure

The interesting messages will be before this. The ones with "pacemaker-
controld" will be the most relevant, at least initially.

> Jul 17 14:18:20.093 FILE-2 pacemakerd  [15956]
> (pcmk_shutdown_worker)   notice: Shutting down Pacemaker
> Jul 17 14:18:20.093 FILE-2 pacemakerd  [15956]
> (pcmk_shutdown_worker)   debug: pacemaker-controld confirmed stopped
> Jul 17 14:18:20.093 FILE-2 pacemakerd  [15956] (stop_child)  
>   notice: Stopping pacemaker-schedulerd | sent signal 15 to process
> 15961
> Jul 17 14:18:20.093 FILE-2 pacemaker-schedulerd[15961]
> (crm_signal_dispatch)notice: Caught 'Terminated' signal | 15
> (invoking handler)
> Jul 17 14:18:20.093 FILE-2 pacemaker-schedulerd[15961]
> (qb_ipcs_us_withdraw)info: withdrawing server sockets
> Jul 17 14:18:20.093 FILE-2 pacemaker-schedulerd[15961]
> (qb_ipcs_unref)  debug: qb_ipcs_unref() - destroying
> Jul 17 14:18:20.093 FILE-2 pacemaker-schedulerd[15961]
> (crm_xml_cleanup)info: Cleaning up memory from libxml2
> Jul 17 14:18:20.093 FILE-2 pacemaker-schedulerd[15961] (crm_exit)
>   info: Exiting pacemaker-schedulerd | with status 0
> Jul 17 14:18:20.093 FILE-2 pacemaker-based [15957]
> (qb_ipcs_event_sendv)debug: new_event_notification (/dev/shm/qb-
> 15957-15962-12-RDPw6O/qb): Broken pipe (32)
> Jul 17 14:18:20.093 FILE-2 pacemaker-based [15957]
> (cib_notify_send_one)warning: Could not notify client crmd:
> Broken pipe | id=e29d175e-7e91-4b6a-bffb-fabfdd7a33bf
> Jul 17 14:18:20.093 FILE-2 pacemaker-based [15957]
> (cib_process_request)info: Completed cib_delete operation for
> section //node_state[@uname='FILE-2']/*: OK (rc=0, origin=FILE-
> 6/crmd/74, version=0.24.75)
> Jul 17 14:18:20.093 FILE-2 pacemaker-fenced[15958]
> (xml_patch_version_check)debug: Can apply patch 0.24.75 to
> 0.24.74
> Jul 17 14:18:20.093 FILE-2 pacemakerd  [15956]
> (pcmk_child_exit)info: pacemaker-schedulerd[15961] exited
> with status 0 (OK)
> Jul 17 14:18:20.093 FILE-2 pacemaker-based [15957]
> (cib_process_request)info: Completed cib_modify operation for
> section status: OK (rc=0, origin=FILE-6/crmd/75, version=0.24.75)
> Jul 17 14:18:20.093 FILE-2 pacemakerd  [15956]
> (pcmk_shutdown_worker)   debug: pacemaker-schedulerd confirmed
> stopped
> Jul 17 14:18:20.093 FILE-2 pacemakerd  [15956] (stop_child)  
>   notice: Stopping pacemaker-attrd | sent signal 15 to process 15960
> Jul 17 14:18:20.093 FILE-2 pacemaker-attrd [15960]
> (crm_signal_dispatch)notice: Caught 'Terminated' signal | 15
> (invoking handler)
> 
> Could you please help me understand the issue here.
> 
> Regards
> Priyanka
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] FEEDBACK WANTED: possible deprecation of nagios-class resources

2023-07-17 Thread Ken Gaillot
On Fri, 2023-07-14 at 17:24 +0800, Mr.R wrote:
> Hi,
> 
> Thanks for your answer, it supplys some useful ideas about nagios.
> And, there are some questions that need to communicate.
> 
> 1. Is there a specific use case for bundle? No official use case has
> been found so far. Could you please provide it?

Bundles are the preferred method for launching containerized
applications.

A bundle lets you configure a single resource specifying the number of
container instances desired, whether they need individual IP addresses
and/or host storage, and the application to be monitored inside the
container. See:

https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/singlehtml/index.html#bundles-containerized-resources

> 
> 2. What are the nagios plugins that pacemaker has supported in the
> past?  As far as I know, there are only a few nagios plugins about
> network,like check_tcp, check_udp and so on. Are there any complex
> monitoring plugins such as databases or services? Could you please
> provide it? If not, will you consider adding something similar?

Pacemaker supports any nagios plugin, as long as a file is available
listing its options in OCF meta-data format. The nagios-plugins-
metadata project provides such files for many common plugins.

> If nagios is deprecated, are there any other scripts or templates
> that the community might develop for to monitor virtual machines's
> services or databases?

The ocf:pacemaker:VirtualDomain agent has a "monitor_scripts" parameter
that lets you provide arbitrary scripts to monitor whatever is inside
the virtual domain. The only requirement is that the scripts must exit
with zero status on success and nonzero on error. The scripts have
access to Pacemaker's environment variables, so they can get the
parameter values used by the agent.

That means you could write a trivial script that calls any nagios
plugin you want using the relevant parameters, then returns success or
failure.

> 
> thanks again.
> 
> >Hi,
> 
> >Thanks for the feedback, it's very helpful to hear about a real-
> world use case.
> 
> >At this point (and for any remaining 2.1 releases), the only effect
> is a warning in the logs when nagios resources are configured.
> Eventually (probably next year), there will be a 2.2.0 or 3.0.0
> release, and we can consider dropping support then.
> 
> >The idea was that since nagios resources were first introduced,
> better solutions (such as Pacemaker Remote and bundles) have emerged
> for particular use cases. Being able to reduce how much code needs to
> be maintained can be a big help to developers, so if the remaining
> use cases aren't widely needed, it can be worthwhile to drop support.
> 
> >The main use case where nagios resources can still be helpful is
> (as you mentioned) when a VM or container image can't be modified.
> 
> >The alternative would be to write a custom OCF agent for the
> image. Basically you could take the usual OCF agent (VirtualDomain,
> docker,podman, etc.) and change the monitor action to do whatever you
> want. If you have an existing nagios check, you could even just call
> that from the monitor action, so it would be just a few lines of
> coding.
> 
> >If custom agents are not convenient enough, we could consider "un-
> deprecating" nagios resources if there is demand to keep them.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] pcs node removal still crm_node it is removed node is listing as lost node

2023-07-13 Thread Ken Gaillot
On Thu, 2023-07-13 at 09:58 +, S Sathish S via Users wrote:
> Hi Team,
>  
> Problem Statement : we are trying to remove node on pcs cluster, post
> execution also still crm_node 
> it is removed node is listing as lost node.
>  
> we have checked corosync.conf file it is removed but still it is
> displaying on
> crm_node -l.
>  
> [root@node1 ~]# pcs cluster node remove node2 --force
> Destroying cluster on hosts: 'node2'...
> node2: Successfully destroyed cluster
> Sending updated corosync.conf to nodes...
> node1: Succeeded
> node1: Corosync configuration reloaded
>  
> [root@node1 ~]# crm_node -l
> 1 node1 member
> 2 node2 lost

This looks like a possible regression. The "node remove" command should
erase all knowledge of the node, but I can reproduce this, and I don't
see log messages I would expect. I'll have to investigate further.

>  
> In RHEL 7.x we are using below rpm version not seeing this issue
> while removing the node.
> pacemaker-2.0.2-2.el7
> corosync-2.4.4-2.el7
> pcs-0.9.170-1.el7
>  
> In RHEL 8.x we are using below rpm version but seeing above issue
> over here.
> pacemaker-2.1.6-1.el8
> corosync-3.1.7-1.el8
> pcs-0.10.16-1.el8
>  
> Thanks and Regards,
> S Sathish S
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] newly created clone waits for off-lined node

2023-07-13 Thread Ken Gaillot
On Wed, 2023-07-12 at 21:08 +0200, lejeczek via Users wrote:
> Hi guys.
> 
> I have a fresh new 'galera' clone and that one would not start &
> cluster says:
> ...
> INFO: Waiting on node  to report database status before Master
> instances can start.
> ...
> 
> Is that only for newly created resources - which I guess it must be -
> and if so then why?
> Naturally, next question would be - how to make such resource start
> in that very circumstance?
> 
> many thank, L.

That is part of the agent rather than Pacemaker. Looking at the agent
code, it's based on a node attribute the agent sets, so it is only
empty for newly created resources that haven't yet run on a node. I'm
not sure if there's a way around it. (Anyone else have experience with
that?)
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] location constraint does not move promoted resource ?

2023-07-03 Thread Ken Gaillot
On Mon, 2023-07-03 at 19:22 +0300, Andrei Borzenkov wrote:
> On 03.07.2023 18:07, Ken Gaillot wrote:
> > On Mon, 2023-07-03 at 12:20 +0200, lejeczek via Users wrote:
> > > On 03/07/2023 11:16, Andrei Borzenkov wrote:
> > > > On 03.07.2023 12:05, lejeczek via Users wrote:
> > > > > Hi guys.
> > > > > 
> > > > > I have pgsql which I constrain like so:
> > > > > 
> > > > > -> $ pcs constraint location PGSQL-clone rule role=Promoted
> > > > > score=-1000 gateway-link ne 1
> > > > > 
> > > > > and I have a few more location constraints with that
> > > > > ethmonitor & those work, but this one does not seem to.
> > > > > When the constraint is created the cluster is silent, no errors nor
> > > > > warning, but relocation does not take place.
> > > > > I can move promoted resource manually just fine, to that
> > > > > node where 'location' should move it.
> > > > > 
> > > > 
> > > > Instance to promote is selected according to promotion
> > > > scores which are normally set by resource agent.
> > > > Documentation implies that standard location constraints
> > > > are also taken in account, but there is no explanation how
> > > > promotion scores interoperate with location scores. It is
> > > > possible that promotion score in this case takes precedence.
> > > It seems to have kicked in with score=-1 but..
> > > that was me just guessing.
> > > Indeed it would be great to know how those are calculated,
> > > in a way which would' be admin friendly or just obvious.
> > > 
> > > thanks, L.
> > 
> > It's a longstanding goal to have some sort of tool for explaining
> > how
> > scores interact in a given situation. However it's a challenging
> > problem and there's never enough time ...
> > 
> > Basically, all scores are added together for each node, and the
> > node
> > with the highest score runs the resource, subject to any placement
> > strategy configured. These mainly include stickiness, location
> > constraints, colocation constraints, and node health. Nodes may be
> 
> And you omitted the promotion scores which was the main question.

Oh right -- first, the above is used to determine the nodes on which
clone instances will be placed. After that, an appropriate number of
nodes are selected for the promoted role, based on promotion scores and
location and colocation constraints for the promoted role.

When colocations are considered, chained colocations are considered at
an attenuated score. If A is colocated with B, and B is colocated with
C, A's preferences are considered when assigning C to a node, but at
less than full strength. That's one of the reasons it gets complicated
to figure out a particular situation.
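
In the meantime, crm_simulate can at least show the final numbers the
scheduler came up with, even if it doesn't explain how they were combined:

# Show allocation (and promotion) scores for the current live cluster state
crm_simulate --live-check --show-scores

# equivalent short form
crm_simulate -sL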

> > eliminated from consideration by resource migration thresholds,
> > standby/maintenance mode, etc.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] location constraint does not move promoted resource ?

2023-07-03 Thread Ken Gaillot
On Mon, 2023-07-03 at 12:20 +0200, lejeczek via Users wrote:
> 
> On 03/07/2023 11:16, Andrei Borzenkov wrote:
> > On 03.07.2023 12:05, lejeczek via Users wrote:
> > > Hi guys.
> > > 
> > > I have pgsql which I constrain like so:
> > > 
> > > -> $ pcs constraint location PGSQL-clone rule role=Promoted
> > > score=-1000 gateway-link ne 1
> > > 
> > > and I have a few more location constraints with that
> > > ethmonitor & those work, but this one does not seem to.
> > > When the constraint is created the cluster is silent, no errors nor
> > > warning, but relocation does not take place.
> > > I can move promoted resource manually just fine, to that
> > > node where 'location' should move it.
> > > 
> > 
> > Instance to promote is selected according to promotion 
> > scores which are normally set by resource agent. 
> > Documentation implies that standard location constraints 
> > are also taken in account, but there is no explanation how 
> > promotion scores interoperate with location scores. It is 
> > possible that promotion score in this case takes precedence.
> It seems to have kicked in with score=-1 but..
> that was me just guessing.
> Indeed it would be great to know how those are calculated, 
> in a way which would' be admin friendly or just obvious.
> 
> thanks, L.

It's a longstanding goal to have some sort of tool for explaining how
scores interact in a given situation. However it's a challenging
problem and there's never enough time ...

Basically, all scores are added together for each node, and the node
with the highest score runs the resource, subject to any placement
strategy configured. These mainly include stickiness, location
constraints, colocation constraints, and node health. Nodes may be
eliminated from consideration by resource migration thresholds,
standby/maintenance mode, etc.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] FEEDBACK WANTED: possible deprecation of nagios-class resources

2023-07-03 Thread Ken Gaillot
Hi,

Thanks for the feedback, it's very helpful to hear about a real-world
use case.

At this point (and for any remaining 2.1 releases), the only effect is
a warning in the logs when nagios resources are configured. Eventually
(probably next year), there will be a 2.2.0 or 3.0.0 release, and we
can consider dropping support then.

The idea was that since nagios resources were first introduced, better
solutions (such as Pacemaker Remote and bundles) have emerged for
particular use cases. Being able to reduce how much code needs to be
maintained can be a big help to developers, so if the remaining use
cases aren't widely needed, it can be worthwhile to drop support.

The main use case where nagios resources can still be helpful is (as
you mentioned) when a VM or container image can't be modified.

The alternative would be to write a custom OCF agent for the image.
Basically you could take the usual OCF agent (VirtualDomain, docker,
podman, etc.) and change the monitor action to do whatever you want. If
you have an existing nagios check, you could even just call that from
the monitor action, so it would be just a few lines of coding.
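
As a very rough sketch of that idea (the plugin path and URL are placeholders,
and meta-data/start/stop are stubbed out), the agent's action dispatch could
look like:

#!/bin/sh
# Sketch of a custom OCF-style agent whose monitor wraps a nagios plugin.
# A real agent also needs proper meta-data, validation, and start/stop logic.
: ${OCF_FUNCTIONS_DIR=${OCF_ROOT:-/usr/lib/ocf}/lib/heartbeat}
. ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs

wrapped_monitor() {
    # Substitute whatever nagios check makes sense for your image
    if /usr/lib64/nagios/plugins/check_http -H 127.0.0.1 -u /health >/dev/null 2>&1; then
        return $OCF_SUCCESS
    fi
    return $OCF_NOT_RUNNING
}

case "$1" in
    monitor)    wrapped_monitor ;;
    start|stop) exit $OCF_SUCCESS ;;   # real start/stop logic omitted here
    meta-data)  exit $OCF_SUCCESS ;;   # meta-data omitted from this sketch
    *)          exit $OCF_ERR_UNIMPLEMENTED ;;
esac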

If custom agents are not convenient enough, we could consider "un-
deprecating" nagios resources if there is demand to keep them.

On Fri, 2023-06-30 at 10:51 +0800, Mr.R via Users wrote:
> Hi Ken Gaillot
> 
> There are a few questions about nagios,
> In pacemaker-2.1.6, the nagios-class resource may be deprecated.
> Is it now or phased out? In some cases, if the machine being
> monitored cannot be modified, nagios can solve the problem.  Is it
> deprecated because there are fewer application scenarios? and will be
> there an alternative resource template?
> 
> thanks,
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] no-quorum-policy=ignore is (Deprecated ) and replaced with other options but not an effective solution

2023-06-27 Thread Ken Gaillot
\params
> > > >>> pcmk_delay_base=5s.*
> > > >>>
> > > >>> property cib-bootstrap-options: \
> > > >>>     have-watchdog=true \
> > > >>>     dc-version="2.1.2+20211124.ada5c3b36-150400.2.43-2.1.2+20211124.ada5c3b36" \
> > > >>>     cluster-infrastructure=corosync \
> > > >>>     cluster-name=FILE \
> > > >>>     stonith-enabled=true \
> > > >>>     stonith-timeout=172 \
> > > >>>     stonith-action=reboot \
> > > >>>     stop-all-resources=false \
> > > >>>     no-quorum-policy=ignore
> > > >>> rsc_defaults build-resource-defaults: \
> > > >>>     resource-stickiness=1
> > > >>> rsc_defaults rsc-options: \
> > > >>>     resource-stickiness=100 \
> > > >>>     migration-threshold=3 \
> > > >>>     failure-timeout=1m \
> > > >>>     cluster-recheck-interval=10min
> > > >>> op_defaults op-options: \
> > > >>>     timeout=600 \
> > > >>>     record-pending=true
> > > >>>
> > > >>> On a 4-node setup when the whole cluster is brought up
> > > together we see
> > > >>> error logs like:
> > > >>>
> > > >>> *2023-06-26T11:35:17.231104+00:00 FILE-1 pacemaker-
> > > schedulerd[26359]:
> > > >>> warning: Fencing and resource management disabled due to lack
> > > of quorum*
> > > >>>
> > > >>> *2023-06-26T11:35:17.231338+00:00 FILE-1 pacemaker-
> > > schedulerd[26359]:
> > > >>> warning: Ignoring malformed node_state entry without uname*
> > > >>>
> > > >>> *2023-06-26T11:35:17.233771+00:00 FILE-1 pacemaker-
> > > schedulerd[26359]:
> > > >>> warning: Node FILE-2 is unclean!*
> > > >>>
> > > >>> *2023-06-26T11:35:17.233857+00:00 FILE-1 pacemaker-
> > > schedulerd[26359]:
> > > >>> warning: Node FILE-3 is unclean!*
> > > >>>
> > > >>> *2023-06-26T11:35:17.233957+00:00 FILE-1 pacemaker-
> > > schedulerd[26359]:
> > > >>> warning: Node FILE-4 is unclean!*
> > > >>>
> > > >>
> > > >> According to this output FILE-1 lost connection to three other
> > > nodes, in
> > > >> which case it cannot be quorate.
> > > >>
> > > >>>
> > > >>> Kindly help correct the configuration to make the system
> > > function
> > > >> normally
> > > >>> with all resources up, even if there is just one node up.
> > > >>>
> > > >>> Please let me know if any more info is needed.
> > > >>>
> > > >>> Thanks
> > > >>> Priyanka
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] no-quorum-policy=ignore is (Deprecated ) and replaced with other options but not an effective solution

2023-06-27 Thread Ken Gaillot
 pacemaker-
> > schedulerd[26359]:
> > > warning: Node FILE-4 is unclean!*
> > > 
> > 
> > According to this output FILE-1 lost connection to three other
> > nodes, in 
> > which case it cannot be quorate.
> > 
> > > 
> > > Kindly help correct the configuration to make the system function
> > normally
> > > with all resources up, even if there is just one node up.
> > > 
> > > Please let me know if any more info is needed.
> > > 
> > > Thanks
> > > Priyanka
> > > 
> > > 
> > > ___
> > > Manage your subscription:
> > > https://lists.clusterlabs.org/mailman/listinfo/users
> > > 
> > > ClusterLabs home: https://www.clusterlabs.org/
> > 
> > ___
> > Manage your subscription:
> > https://lists.clusterlabs.org/mailman/listinfo/users
> > 
> > ClusterLabs home: https://www.clusterlabs.org/
> 
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Pacemaker logs written on message which is not expected as per configuration

2023-06-26 Thread Ken Gaillot
On Mon, 2023-06-26 at 08:46 +0200, Klaus Wenninger wrote:
> 
> 
> On Fri, Jun 23, 2023 at 3:57 PM S Sathish S via Users <
> users@clusterlabs.org> wrote:
> > Hi Team,
> >  
> > The pacemaker logs is written in both '/var/log/messages' and
> > '/var/log/pacemaker/pacemaker.log'.
> > Could you please help us for not write pacemaker processes in
> > /var/log/messages? Even corosync configuration we have set
> > to_syslog: no.
> > Attached the corosync.conf file.
> >  
> > Pacemaker 2.1.6
> >  
> > [root@node1 username]# tail -f /var/log/messages
> > Jun 23 13:45:38 node1 ESAFMA_RA(ESAFMA_node1)[3593054]: INFO:
> >  component is running with 10502  number
> > Jun 23 13:45:38 node1
> > HealthMonitor_RA(HEALTHMONITOR_node1)[3593055]: INFO: Health
> > Monitor component is running with 3046  number
> > Jun 23 13:45:38 node1 ESAPMA_RA(ESAPMA_OCC)[3593056]: INFO: 
> > component is running with 10902  number
> > Jun 23 13:45:38 node1 HP_AMSD_RA(HP_AMSD_node1)[3593057]: INFO:
> >  component is running with 2540  number
> > Jun 23 13:45:38 node1 HP_SMAD_RA(HP_SMAD_node1)[3593050]: INFO:
> >  component is running with 2536  number
> > Jun 23 13:45:38 node1 SSMAGENT_RA(SSMAGENT_node1)[3593068]: INFO:
> >  component is running with 2771  number
> > Jun 23 13:45:38 node1 HazelCast_RA(HAZELCAST_node1)[3593059]: INFO:
> >  component is running with 13355 number
> > Jun 23 13:45:38 node1 HP_SMADREV_RA(HP_SMADREV_node1)[3593062]:
> > INFO:  component is running with 2735  number
> > Jun 23 13:45:38 node1 ESAMA_RA(ESAMA_node1)[3593065]: INFO: 
> > component is running with 9572  number
> > Jun 23 13:45:38 node1 MANAGER_RA(MANAGER_OCC)[3593071]: INFO:
> >  component is running with 10069 number
> > 
> 
> What did you configure in /etc/sysconfig/pacemaker?
>   PCMK_logfacility=none
> should disable all syslogging. 

It's worth mentioning that the syslog gets only the most serious
messages. By default these are critical, error, warning, and notice
level, but you can change that by setting PCMK_logpriority. For example
with PCMK_logpriority=error you will get only critical and error
messages in syslog.
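
For example, in /etc/sysconfig/pacemaker (the path may differ on non-RPM
distributions):

# Send nothing from Pacemaker to syslog at all
PCMK_logfacility=none

# ...or keep syslog but forward only error and critical messages there
# (the detail log in /var/log/pacemaker/pacemaker.log is unaffected)
#PCMK_logpriority=error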

> 
> Klaus
> >  
> >  
> > cat /etc/corosync/corosync.conf
> > totem {
> > version: 2
> > cluster_name: OCC
> > transport: knet
> > crypto_cipher: aes256
> > crypto_hash: sha256
> > cluster_uuid: 20572748740a4ac2a7bcc3a3bb6889e9
> > }
> >  
> > nodelist {
> > node {
> > ring0_addr: node1
> > name: node1
> > nodeid: 1
> > }
> > }
> >  
> > quorum {
> > provider: corosync_votequorum
> > }
> >  
> > logging {
> > to_logfile: yes
> > logfile: /var/log/cluster/corosync.log
> > to_syslog: no
> > timestamp: on
> > }
> >  
> > Thanks and Regards,
> > S Sathish S
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] host in standby causes havoc

2023-06-15 Thread Ken Gaillot
On Thu, 2023-06-15 at 12:58 +0200, Kadlecsik József wrote:
> Hello,
> 
> We had a strange issue here: 7 node cluster, one node was put into
> standby 
> mode to test a new iscsi setting on it. During configuring the
> machine it 
> was rebooted and after the reboot the iscsi didn't come up. That
> caused a 
> malformed communication (atlas5 is the node in standby) with the
> cluster:
> 
> Jun 15 10:10:13 atlas0 pacemaker-schedulerd[7153]:  warning:
> Unexpected 
> result (error) was recorded for probe of ocsi on atlas5 at Jun 15
> 10:09:32 2023
> Jun 15 10:10:13 atlas0 pacemaker-schedulerd[7153]:  notice: If it is
> not 
> possible for ocsi to run on atlas5, see the resource-discovery option
> for 
> location constraints
> Jun 15 10:10:13 atlas0 pacemaker-schedulerd[7153]:  error: Resource
> ocsi 
> is active on 2 nodes (attempting recovery)

Newer versions reword this as "might be active". The idea is that if
the probe returns an error, we don't know the state of the resource on
that node. From an HA perspective, we have to assume the worst, that
the resource could be active there.

> The resource was definitely not active on 2 nodes. And that caused a
> storm 
> of killing all virtual machines as resources.

The cluster would first try to stop ocsi on that node as well as the
node where it's known to be running. If a stop fails, then the cluster
will try to fence that node.

> How could one prevent such cases to come up?

It sounds like maybe the agent can't probe or stop in certain
situations. It may be possible to improve the agent. For example, some
agents return an error if key software isn't installed, but for a probe
or stop, that's fine -- if the software isn't installed, it's
definitely not running.
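
As a sketch of what that looks like in a shell-based agent (the daemon name
here is just a placeholder):

ocsi_monitor() {
    # If the managed software isn't even installed, it can't be running.
    # Returning "not running" instead of an error keeps probes and stops clean.
    command -v ocsi_daemon >/dev/null 2>&1 || return $OCF_NOT_RUNNING

    # ... the agent's real status check would go here ...
    return $OCF_SUCCESS
}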

> 
> Best regards,
> Jozsef
> --
> E-mail : kadlecsik.joz...@wigner.hu
> PGP key: https://wigner.hu/~kadlec/pgp_public_key.txt
> Address: Wigner Research Centre for Physics
>  H-1525 Budapest 114, POB. 49, Hungary
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] cluster okey but errors when tried to move resource - ?

2023-06-12 Thread Ken Gaillot
On Fri, 2023-06-09 at 14:03 +0200, Ondrej Mular wrote:
> To me, this seems like an issue in `crm_resource` as the error
> message
> comes from it. Pcs is actually using `crm_resource --move` when
> moving
> resources. In this case, pcs should call `crm_resource --move
> REDIS-clone --node podnode3 --master`, you can see that if you run
> pcs
> with `--debug` option. I guess `crm_resource --move --master` creates
> a location constraint with `role="Promoted"` and doesn't take into
> account the currently used schema. However, I'm unable to test this
> theory as I don't have any testing environment available at the
> moment.
> 
> Ondrej

Ah yes, you are correct. crm_resource does not check the current CIB
first, it just submits a change. It bases the role name on whether
pacemaker was built with the --enable-compat-2.0 flag, which is true
for el8 and false for el9.

We could change it to try with the new name first, and if that fails,
retry with the old name.
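
Until then, one possible workaround on the cluster side (untested here) is to
upgrade the CIB to the newest schema the installed Pacemaker supports, so that
role="Promoted" validates:

# Bump the CIB's validate-with schema to the latest supported version
cibadmin --upgrade --force

# then retry the move
pcs resource move REDIS-clone --promoted podnode3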

> 
> On Fri, 9 Jun 2023 at 01:39, Reid Wahl  wrote:
> > On Thu, Jun 8, 2023 at 2:24 PM lejeczek via Users <
> > users@clusterlabs.org> wrote:
> > > 
> > > 
> > > > Ouch.
> > > > 
> > > > Let's see the full output of the move command, with the whole
> > > > CIB that
> > > > failed to validate.
> > > > 
> > > For a while there I thought perhaps it was just that one
> > > pglsq resource, but it seems that any - though only a few
> > > are set up - (only clone promoted?)resource fails to move.
> > > Perhaps primarily to do with 'pcs'
> > > 
> > > -> $ pcs resource move REDIS-clone --promoted podnode3
> > > Error: cannot move resource 'REDIS-clone'
> > > 1  > > validate-with="pacemaker-3.6" epoch="8212" num_updates="0"
> > > admin_epoch="0" cib-last-written="Thu Jun  8 21:59:53 2023"
> > > update-origin="podnode1" update-client="crm_attribute"
> > > have-quorum="1" update-user="root" dc-uuid="1">
> > 
> > This is the problem: `validate-with="pacemaker-3.6"`. That old
> > schema
> > doesn't support role="Promoted" in a location constraint. Support
> > begins with version 3.7 of the schema:
> > https://github.com/ClusterLabs/pacemaker/commit/e7f1424df49ac41b2d38b72af5ff9ad5121432d2.
> > 
> > You'll need at least Pacemaker 2.1.0.
> > 
> > > 2   
> > > 3 
> > > 4   
> > > 5  > > id="cib-bootstrap-options-have-watchdog"
> > > name="have-watchdog" value="false"/>
> > > 6  > > name="dc-version" value="2.1.6-2.el9-6fdc9deea29"/>
> > > 7  > > id="cib-bootstrap-options-cluster-infrastructure"
> > > name="cluster-infrastructure" value="corosync"/>
> > > 
> > > crm_resource: Error performing operation: Invalid configuration
> > > 
> > > ___
> > > Manage your subscription:
> > > https://lists.clusterlabs.org/mailman/listinfo/users
> > > 
> > > ClusterLabs home: https://www.clusterlabs.org/
> > 
> > 
> > --
> > Regards,
> > 
> > Reid Wahl (He/Him)
> > Senior Software Engineer, Red Hat
> > RHEL High Availability - Pacemaker
> > 
> > ___
> > Manage your subscription:
> > https://lists.clusterlabs.org/mailman/listinfo/users
> > 
> > ClusterLabs home: https://www.clusterlabs.org/
> 
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] cluster okey but errors when tried to move resource - ?

2023-06-05 Thread Ken Gaillot
On Sat, 2023-06-03 at 15:09 +0200, lejeczek via Users wrote:
> Hi guys.
> 
> I've something which I'm new to entirely - cluster which is seemingly
> okey errors, fails to move a resource.

What pcs version are you using? I believe there was a move regression
in a recent push.

> I'd won't contaminate here just yet with long json cluster spits when
> fails but a snippet:
> 
> -> $ pcs resource move PGSQL-clone --promoted podnode1
> Error: cannot move resource 'PGSQL-clone'
>1  epoch="8109" num_updates="0" admin_epoch="0" cib-last-written="Sat
> Jun  3 13:49:34 2023" update-origin="podnode2" update-
> client="cibadmin" have-quorum="1" update-user="root" dc-uuid="2">
>2   
>3 
>4   
>5  name="have-watchdog" value="false"/>
>6 
>7 
>8  name="cluster-name" value="podnodes"/>
>9  name="stonith-enabled" value="false"/>
>   10  name="last-lrm-refresh" value="1683293193"/>
>   11  name="maintenance-mode" value="false"/>
>   12   
>   13   
>   14  name="redis_REPL_INFO" value="c8kubernode1"/>
>   15  name="REDIS_REPL_INFO" value="podnode3"/>
>   16   
>   17 
> ...
> crm_resource: Error performing operation: Invalid configuration
> 
> This one line: (might be more)
>  value="c8kubernode1"/>
> puzzles me, as there is no such node/member in the cluster and so I
> try:

That's not a problem. Pacemaker allows "custom" values in both cluster
options and resource/action meta-attributes. I don't know whether redis
is actually using that or not.

> 
> -> $ pcs property unset redis_REPL_INFO --force
> Warning: Cannot remove property 'redis_REPL_INFO', it is not present
> in property set 'cib-bootstrap-options'

That's because the custom options are in their own
cluster_property_set. I believe pcs can only manage the options in the
cluster_property_set with id="cib-bootstrap-options", so you'd have to
use "pcs cluster edit" or crm_attribute to remove the custom ones.

> 
> Any & all suggestions on how to fix this are much appreciated.
> many thanks, L.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Pacemaker 2.1.6 final release now available

2023-05-24 Thread Ken Gaillot
Hi all,

The final release of Pacemaker 2.1.6 is now available at:

  https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.1.6

Highlights include improvements in node attribute management
capabilities, status display, and resource description usage, as well
as a bunch of bug fixes and a new Python API.

Support for Nagios resources and for the moon phase in date/time-based
rules has been deprecated and will be removed in a future release.

Many thanks to all contributors of source code and language
translations to this release, including binlingyu, Chris Lumens, Fabio
Di Nitto, Gao,Yan, Grace Chin, Ken Gaillot, Klaus Wenninger, lihaipeng,
liupei, liutong, Reid Wahl, Tahlia Richardson, wanglujun, WangMengabc,
xuezhixin, and zhanghuanhuan. 
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Hyperconverged 3 Node Cluster

2023-05-16 Thread Ken Gaillot
Hi,

The "Clusters from Scratch" document is a good place to start:

https://clusterlabs.org/pacemaker/doc/2.1/Clusters_from_Scratch/singlehtml/

It gives a walk-through of creating a two-node cluster with DRBD shared
storage and a web server, but the intent is simply to get familiar with
the cluster tools. Once you're comfortable with that, it should be
reasonably straightforward to do what you want.

On Tue, 2023-05-16 at 22:18 +1000, adam-clusterl...@blueraccoon.com.au
wrote:
> Hi ClusterLabs Community,
> 
> I am rebuilding my homelab and looking to further my learning with
> some 
> new technologies being KVM and pacemaker. Is anyone able to offer
> some 
> general guidance on how I can setup a hyperconverged 3 node cluster
> with 
> KVM without shared storage. Based on my research the best stack for
> my 
> use case appears to be;
> 
> * Rocky 9 - I prefer rpm based distributions
> 
> * KVM for the hypervisor
> 
> * ZFS -  To be able to snapshot my VM's regularly and replicate 
> snapshots to an external ZFS pool
> 
> * Gluster (on top of  ZFS) 3 node replica - Used to sync the storage 
> across all 3 nodes
> 
> 
> I assume the missing piece of the puzzle to enable high availability
> of 
> the virtual machines is pacemaker and corosync?
> 
> Will pacemaker be able to automatically move virtual machines based
> on 
> resource (CPU and Memory) across my 3 nodes?
> 
> How will I be able to migrate virtual machines to each of nodes?
> 
>  Preference would be to be able to use either Cockpit or the
> command 
> line based on what I am doing.
> 
> 
> I am also aware of oVirt that could achieve my goals but the project 
> doesnt seem all that active and I am looking to simplify my solution
> for 
> reliability and ability to troubleshoot.
> 
> 
> Appreciate any guidance to point me the right direction.
> 
> Thanks,
> 
> Adam
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Pacemaker 2.1.6-rc2 now available

2023-05-02 Thread Ken Gaillot
Hi all,

The second (and possibly final) release candidate for Pacemaker 2.1.6
is now available at:

https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.1.6-rc2

This release candidate officially deprecates the "moon" attribute of
date_spec elements in rules, and support for nagios-class resources. It
also includes a few bug fixes. For details, see the above link.

Everyone is encouraged to download, compile and test the new release.
We do many regression tests and simulations, but we can't cover all
possible use cases, so your feedback is important and appreciated.

Many thanks to all contributors of source code and language
translations to this release, including Chris Lumens, Gao,Yan, Ken
Gaillot, and Klaus Wenninger.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Best DRBD Setup

2023-04-26 Thread Ken Gaillot
On Wed, 2023-04-26 at 21:12 +0200, Brian Jenkins wrote:
> Hi all
> 
> I’ve been working on my home cluster setup for a while now and tried
> varying setups for DRBD resources and finally settled on one that I
> think is the best but I’m still not completely satisfied with the
> results. Biggest question is, did I set this up in the best way? Any
> advice would be appreciated. I have multiple servers setup in groups
> and the DRBDs are in separate clones like below. I added constraints
> to hopefully ensure things work together well. If you have any
> questions to clarify my setup let me know.
> 
> 
>* Resource Group: git-server:
> * gitea-mount   (ocf:heartbeat:Filesystem):  Started
> node1
> * git-ip(ocf:heartbeat:IPaddr2): Started node1
> * gitea (systemd:gitea): Started node1
> * backup-gitea  (systemd:backupgitea.timer): Started
> node1
>   * Resource Group: pihole-server:
> * pihole-mount  (ocf:heartbeat:Filesystem):  Started
> node2
> * pihole-ip (ocf:heartbeat:IPaddr2): Started node2
> * pihole-ftl(systemd:pihole-FTL):Started node2
> * pihole-web(systemd:lighttpd):  Started node2
> * pihole-cron   (ocf:heartbeat:symlink): Started
> node2
> * pihole-backup (systemd:backupDRBD@pihole.timer):  
> Started node2
>   * Clone Set: drbd-gitea-clone [drbd-gitea] (promotable):
> * Promoted: [ node1 ]
> * Unpromoted: [ node2 node3 node4 node5 ]
>   * Clone Set: drbd-pihole-clone [drbd-pihole] (promotable):
> * Promoted: [ node2 ]
> * Unpromoted: [ node1 node3 node4 node5 ]
> 
>  Ordering Constraints:
>   start drbd-gitea-clone then start gitea-mount (kind:Mandatory)
>   start drbd-pihole-clone then start pihole-mount (kind:Mandatory)

You want promote then start; with this, the mounts can start before
DRBD is promoted. It's best practice to refer to the group in
constraints rather than a member, but it shouldn't be a problem.
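
Something like this should do it, after removing the existing start-then-start
constraints ("pcs constraint --full" shows their ids for "pcs constraint
remove"):

pcs constraint order promote drbd-gitea-clone then start git-server
pcs constraint order promote drbd-pihole-clone then start pihole-server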

> 
>  Colocation Constraints:
>   pihole-server with drbd-pihole-clone (score:INFINITY) (rsc-
> role:Started) (with-rsc-role:Promoted)
>   git-server with drbd-gitea-clone (score:INFINITY) (rsc-
> role:Started) (with-rsc-role:Promoted)
> 
>  My setup is on five raspberry pis running ubuntu server 22.10 with:
> pacemaker 2.1.4-2ubuntu1
> pcs 0.11.3-1ubuntu1
> drbd 9.2.2-1ppa1~jammy1
> 
>  Overall the setup works but it seems quite fragile. I suffer from
> lots of fencing whenever I reboot a server and it doesn’t want to
> restart correctly. Another thing I have noticed is that it will
> sometimes take as long as 10-12 minutes to mount one of the DRBD
> filesystems (XFS) so I have extended the start timeout for each *-
> mount to 15 minutes.
> 
> Thanks in advance for any advice to improve the setup.
> 
> Brian
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] How to block/stop a resource from running twice?

2023-04-26 Thread Ken Gaillot
On Wed, 2023-04-26 at 17:05 +, fs3000 via Users wrote:
> --- Original Message ---
> On Monday, April 24th, 2023 at 10:08, Andrei Borzenkov <
> arvidj...@gmail.com> wrote:
> 
> 
> > On Mon, Apr 24, 2023 at 11:52 AM Klaus Wenninger 
> > kwenn...@redhat.com wrote:
> > 
> > > The checking for a running resource that isn't expected to be
> > > running isn't done periodically (at
> > > least not per default and I don't know a way to achieve that from
> > > the top of my mind).
> > 
> > op monitor role=Stopped interval=20s
> 
> Thanks a lot for the tip. It works, Not perfect, but that's fine.
> When it detects the service is also active on a second node, it stops
> the service on all nodes, and restarts the service on the first node.
> It would be better if it only stopped the service on the second node,
> leaving the service on the first node untouched. I understand this is
> due to the multiple-active setting however:
> 
> 
> What should the cluster do if it ever finds the resource active on
> more than one node? Allowed values: 
> 
> - block: mark the resource as unmanaged
> - stop_only: stop all active instances and leave them that way
> - stop_start: stop all active instances and start the resource in one
> location only
> 
> DEFAULT: stop_start
> 
> 
> From: 
> https://clusterlabs.org/pacemaker/doc/deprecated/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-resource-options.html

Since Pacemaker 2.1.4, multiple-active can be set to "stop_unexpected"
to do what you want.

It's not the default because some services may no longer operate
correctly if an extra instance was started on the same host, so it's on
the admin to be confident their services can handle it.
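
If you're confident of that, it's a one-liner per resource (the resource name
here is a placeholder):

pcs resource meta my-service multiple-active=stop_unexpected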
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Corosync 3.1.5 Fails to Autostart

2023-04-24 Thread Ken Gaillot
Hi,

With Corosync 3, node names must be specified in
/etc/corosync/corosync.conf like:

node {
ring0_addr: node1
name: node1
nodeid: 1
}

(ring0_addr is a resolvable name used to identify the interface, and
name is the name that should be used in the cluster)

If you set up the cluster from scratch using pcs, it should do that for
you. I'm guessing you reused an older config, or manually set up
corosync.conf.

It shouldn't be necessary to change the After. If it still is an issue
after fixing the config, you might have some unusual dependency like a 
disk that gets mounted later, in which case it would be better to add
an After for the specific dependency.
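
If it does turn out to be an ordering issue, a drop-in (via "systemctl edit
corosync.service") is safer than editing the shipped unit file; for example,
assuming the cluster interface is bond0:

[Unit]
# Wait for the specific interface the cluster uses, rather than multi-user.target
Wants=network-online.target
After=network-online.target sys-subsystem-net-devices-bond0.device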

On Mon, 2023-04-24 at 22:16 +0200, Tyler Phillippe via Users wrote:
> Hello all,
> 
> We are currently using RHEL9 and have set up a PCS cluster. When
> restarting the servers, we noticed Corosync 3.1.5 doesn't start
> properly with the below error message:
> 
> Parse error in config: No valid name found for local host
> Corosync Cluster Engine exiting with status 8 at main.c:1445.
> Corosync.service: Main process exited, code=exited, status=8/n/a
> 
> These are physical, blade machines that are using a 2x Fibre Channel
> NIC in a Mode 6 bond as their networking interface for the cluster;
> other than that, there is really nothing special about these
> machines. We have ensured the names of the machines exist in
> /etc/hosts and that they can resolve those names via the hosts file
> first. The strange thing is if we start Corosync manually after we
> can SSH into the machines, Corosync starts immediately and without
> issue. We did manage to get Corosync to autostart properly by
> modifying the service file and changing the After=network-
> online.target to After=multi-user.target. In doing this, at first,
> Pacemaker complains about mismatching dependencies in the service
> between Corosync and Pacemaker. Changing the Pacemaker service to
> After=multi-user.target fixes that self-caused issue. Any ideas on
> this one? Mostly checking to see if changing the After dependency
> will harm us in the future.
> 
> Thanks!
> 
> Respectfully,
>  Tyler Phillippe
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] FEEDBACK WANTED: possible deprecation of nagios-class resources

2023-04-19 Thread Ken Gaillot
Hi all,

I am considering deprecating Pacemaker's support for nagios-class
resources.

This has nothing to do with nagios monitoring of a Pacemaker cluster,
which would be unaffected. This is about Pacemaker's ability to use
nagios plugin scripts as a type of resource.

Nagios-class resources act as a sort of proxy monitor for another
resource. You configure the main resource (typically a VM or container)
as normal, then configure a nagios: resource to monitor it. If
the nagios monitor fails, the main resource is restarted.

Most of that use case is now better covered by Pacemaker Remote and
bundles. The only advantage of nagios-class resources these days is if
you have a VM or container image that you can't modify (to add
Pacemaker Remote). If we deprecate nagios-class resources, the
preferred alternative would be to write a custom OCF agent with the
monitoring you want (which can even call a nagios plugin).

Does anyone here use nagios-class resources? If it's actively being
used, I'm willing to keep it around. But if there's no demand, we'd
rather not have to maintain that (poorly tested) code forever.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] manner in which cluster migrates VirtualDomain - ?

2023-04-19 Thread Ken Gaillot
On Wed, 2023-04-19 at 08:00 +0200, lejeczek via Users wrote:
> 
> On 18/04/2023 21:02, Ken Gaillot wrote:
> > On Tue, 2023-04-18 at 19:36 +0200, lejeczek via Users wrote:
> > > On 18/04/2023 18:22, Ken Gaillot wrote:
> > > > On Tue, 2023-04-18 at 14:58 +0200, lejeczek via Users wrote:
> > > > > Hi guys.
> > > > > 
> > > > > When it's done by the cluster itself, eg. a node goes
> > > > > 'standby' -
> > > > > how
> > > > > do clusters migrate VirtualDomain resources?
> > > > 1. Call resource agent migrate_to action on original node
> > > > 2. Call resource agent migrate_from action on new node
> > > > 3. Call resource agent stop action on original node
> > > > 
> > > > > Do users have any control over it and if so then how?
> > > > The allow-migrate resource meta-attribute (true/false)
> > > > 
> > > > > I'd imagine there must be some docs - I failed to find
> > > > It's sort of scattered throughout Pacemaker Explained -- the
> > > > main
> > > > one
> > > > is:
> > > > 
> > > > https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/html/advanced-options.html#migrating-resources
> > > > 
> > > > > Especially in large deployments one obvious question would be
> > > > > -
> > > > > I'm
> > > > > guessing as my setup is rather SOHO - can VMs migrate in
> > > > > sequence
> > > > > or
> > > > > it is(always?) a kind of 'swarm' migration?
> > > > The migration-limit cluster property specifies how many live
> > > > migrations
> > > > may be initiated at once (the default of -1 means unlimited).
> > > But if this is cluster property - unless I got it wrong,
> > > hopefully - then this govern any/all resources.
> > > If so, can such a limit be rounded down to RA type or
> > > perhaps group of resources?
> > > 
> > > many thanks, L.
> > No, it's global
> To me it feels so intuitive, so natural & obvious that I 
> will ask - nobody yet suggested that such feature be 
> available to smaller divisions of cluster independently of 
> global rule?
> In the vastness of resource types many are polar opposites 
> and to treat them all the same?
> Would be great to have some way to tell cluster to run 
> different migration/relocation limits on for eg. 
> compute-heavy resources VS light-weight ones - where to 
> "file" such a enhancement suggestion, Bugzilla?
> 
> many thanks, L.

Looking at the code, I see it's a little different than I originally
thought.

First, I overlooked that it's correctly documented as a per-node limit
rather than a cluster-wide limit.

That highlights the complexity of allowing different values for
different resources; if rscA has a migration limit of 2, and rscB has a
migration limit of 5, do we allow up to 2 rscA migrations and 5 rscB
migrations simultaneously, or do we weight them relative to each other
so the total capacity is still constrained (for example limiting it to
1 rscA migration and 2 rscB migrations together)?

We would almost need something like the node utilization feature, being
able to define a node's total migration capacity and then how much of
that capacity is taken up by the migration of a specific resource. That
seems overcomplicated to me, especially since there aren't that many
resource types that support live migration.

Second, any actions on a Pacemaker Remote node count toward the
throttling limit of its connection host, and aren't checked for
migration-limit at all. That's an interesting design choice, and it's
not clear what the ideal would be. For a VM or container, it kind of
makes sense to count against the host's throttling. For a remote node,
not so much. And I'm guessing not checking migration-limit in this case
is an oversight.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Offtopic - role migration

2023-04-18 Thread Ken Gaillot
On Tue, 2023-04-18 at 19:50 +0200, Vladislav Bogdanov wrote:
> Btw, an interesting question. How much efforts would it take to
> support a migration of a Master role over the nodes? An use-case is
> drbd, configured for a multi-master mode internally, but with master-
> max=1 in the resource definition. Assuming that resource-agent
> supports that flow - 
> 1. Do nothing. 
> 2. Promote on a dest node. 
> 3. Demote on a source node.
> 
> Actually just wonder, because may be it could be some-how achievable
> to migrate VM which are on top of drbd which is not a multi-master in
> pacemaker. Fully theoretical case. Didn't verify the flow in-the-
> mind.
> 
> I believe that currently only the top-most resource is allowed to
> migrate, but may be there is some room for impovement?
> 
> Sorry for the off-topic.
> 
> Best
> Vlad

It would be worthwhile, but conceptually it's difficult to imagine a
solution.

If a resource must be colocated with the promoted role of another
resource, and only one instance can be promoted at a time, how would it
be possible to live-migrate? You couldn't promote the new instance
before demoting the old one, and you couldn't demote the old one
without stopping the dependent resource.

You would probably need some really complex new constraint types and/or
resource agent actions. Something like "colocate rsc1 with the promoted
role of rsc2-clone, unless it needs to migrate, in which case call this
special agent action to prepare it for running with a demoted instance,
then demote the instance, then migrate the resource, then promote the
new instance, then call this other agent action to return it to normal
operation".
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] manner in which cluster migrates VirtualDomain - ?

2023-04-18 Thread Ken Gaillot
On Tue, 2023-04-18 at 19:36 +0200, lejeczek via Users wrote:
> 
> On 18/04/2023 18:22, Ken Gaillot wrote:
> > On Tue, 2023-04-18 at 14:58 +0200, lejeczek via Users wrote:
> > > Hi guys.
> > > 
> > > When it's done by the cluster itself, eg. a node goes 'standby' -
> > > how
> > > do clusters migrate VirtualDomain resources?
> > 1. Call resource agent migrate_to action on original node
> > 2. Call resource agent migrate_from action on new node
> > 3. Call resource agent stop action on original node
> > 
> > > Do users have any control over it and if so then how?
> > The allow-migrate resource meta-attribute (true/false)
> > 
> > > I'd imagine there must be some docs - I failed to find
> > It's sort of scattered throughout Pacemaker Explained -- the main
> > one
> > is:
> > 
> > https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/html/advanced-options.html#migrating-resources
> > 
> > > Especially in large deployments one obvious question would be -
> > > I'm
> > > guessing as my setup is rather SOHO - can VMs migrate in sequence
> > > or
> > > it is(always?) a kind of 'swarm' migration?
> > The migration-limit cluster property specifies how many live
> > migrations
> > may be initiated at once (the default of -1 means unlimited).
> But if this is cluster property - unless I got it wrong, 
> hopefully - then this govern any/all resources.
> If so, can such a limit be rounded down to RA type or 
> perhaps group of resources?
> 
> many thanks, L.

No, it's global
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] manner in which cluster migrates VirtualDomain - ?

2023-04-18 Thread Ken Gaillot
On Tue, 2023-04-18 at 14:58 +0200, lejeczek via Users wrote:
> Hi guys.
> 
> When it's done by the cluster itself, eg. a node goes 'standby' - how
> do clusters migrate VirtualDomain resources?

1. Call resource agent migrate_to action on original node
2. Call resource agent migrate_from action on new node
3. Call resource agent stop action on original node

> Do users have any control over it and if so then how?

The allow-migrate resource meta-attribute (true/false)

> I'd imagine there must be some docs - I failed to find

It's sort of scattered throughout Pacemaker Explained -- the main one
is:

https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/html/advanced-options.html#migrating-resources

> Especially in large deployments one obvious question would be - I'm
> guessing as my setup is rather SOHO - can VMs migrate in sequence or
> it is(always?) a kind of 'swarm' migration?

The migration-limit cluster property specifies how many live migrations
may be initiated at once (the default of -1 means unlimited).
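
For example, with pcs (the resource name is a placeholder):

# Allow the cluster to live-migrate this VM instead of stopping and starting it
pcs resource meta my-vm allow-migrate=true

# Limit how many live migrations may be initiated in parallel
pcs property set migration-limit=2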
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Pacemaker 2.1.6-rc1 now available

2023-04-17 Thread Ken Gaillot
Hi all,

The first release candidate for Pacemaker 2.1.6 is now available at:

https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.1.6-rc1

Highlights include new features for managing node attributes, and
easier use of resource descriptions.

Utilization node attributes may now be transient. attrd_updater now
supports a --wait parameter to return only once the new value is
effective locally or cluster-wide. Both crm_attribute and attrd_updater
support --pattern in more usages.

Resource descriptions may now be managed using the crm_resource --
element option, and will be displayed in crm_mon --show-detail output.

In addition, alerts and alert recipients may now be temporarily
disabled by setting the new "enabled" meta-attribute to false.

This release also includes a number of bug fixes. For details, see the
above link.

Everyone is encouraged to download, compile and test the new release.
We do many regression tests and simulations, but we can't cover all
possible use cases, so your feedback is important and appreciated.

Many thanks to all contributors of source code and language
translations to this release, including binlingyu, Chris Lumens, Fabio
M. Di Nitto, Gao,Yan, Grace Chin, Ken Gaillot, Klaus Wenninger,
lihaipeng, liupei, liutong, Reid Wahl, Tahlia Richardson, wanglujun,
WangMengabc, xuezhixin, and zhanghuanhuan.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] ClusterMon resource creation getting illegal option -- E in ClusterMon

2023-04-12 Thread Ken Gaillot
ClusterMon with -E has been superseded by Pacemaker's built-in alerts
functionality:

https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/singlehtml/index.html#document-alerts
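
As a rough sketch with pcs (the recipient value is a placeholder, and note
that alert agents receive CRM_alert_* environment variables rather than
ClusterMon's CRM_notify_* ones, so the script may need a small adjustment):

pcs alert create path=/tmp/tools/PCSESA.sh id=pcsesa_alert
pcs alert recipient add pcsesa_alert value=snmp-manager.example.com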

On Wed, 2023-04-12 at 12:03 +, S Sathish S via Users wrote:
> Hi Team,
>  
> While creating ClusterMon resource agent in Clusterlab High
> Availability getting illegal option -- E in ClusterMon.
>  
> [root@node1 tmp]# pcs resource create SNMP_test
> ocf:pacemaker:ClusterMon  extra_options="-E /tmp/tools/PCSESA.sh"
> Error: Validation result from agent (use --force to override):
>   /usr/lib/ocf/resource.d/pacemaker/ClusterMon: illegal option -- E
>   Apr 12 13:36:47 ERROR: Invalid options -E /tmp/tools/PCSESA.sh!
> Error: Errors have occurred, therefore pcs is unable to continue
> [root@node1 tmp]#
>  
> As per above error we use --force option now resource is getting
> created but still we get this error in the system  , But ClusterMon
> resource functionality is working as expected . we need to understand
> any impact with below error / how to rectify illegal option on
> ClusterMon.
>  
> [root@node1 tmp]# pcs resource create SNMP_test
> ocf:pacemaker:ClusterMon  extra_options="-E /tmp/tools/PCSESA.sh" --
> force
> Warning: Validation result from agent:
>   /usr/lib/ocf/resource.d/pacemaker/ClusterMon: illegal option -- E
>   Apr 12 13:49:43 ERROR: Invalid options -E /tmp/tools/PCSESA.sh!
> [root@node1 tmp]#
>  
> Please find the Clusterlab RPM version used:
> pacemaker-cluster-libs-2.1.4-1.2.1.4.git.el8.x86_64
> resource-agents-4.11.0-1.el8.x86_64
> pacemaker-cli-2.1.4-1.2.1.4.git.el8.x86_64
> pcs-0.10.14-1.el8.x86_64
> corosynclib-3.1.7-1.el8.x86_64
> corosync-3.1.7-1.el8.x86_64
> pacemaker-2.1.4-1.2.1.4.git.el8.x86_64
> pacemaker-libs-2.1.4-1.2.1.4.git.el8.x86_64
> pacemaker-schemas-2.1.4-1.2.1.4.git.el8.noarch
>  
> Thanks and Regards,
> S Sathish S
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Location not working [FIXED]

2023-04-11 Thread Ken Gaillot
On Tue, 2023-04-11 at 17:31 +0300, Miro Igov wrote:
> I fixed the issue by changing location definition from:
>  
> location intranet-ip_on_any_nginx intranet-ip \
> rule -inf: opa-nginx_1_active eq 0 \
> rule -inf: opa-nginx_2_active eq 0
>  
> To:
>  
> location intranet-ip_on_any_nginx intranet-ip \
> rule opa-nginx_1_active eq 1 \
>rule opa-nginx_2_active eq 1
>  
> Now it works fine and shows the constraint with: crm res constraint
> intranet-ip

Ah, I suspect the issue was that the original constraint compared only
against 0, when initially (before the resources ever start) the
attribute is undefined.

Note that your new constraint says that the IP *prefers* to run where
the attribute is 1, but if there are no nodes with the attribute set to
1, it can still start somewhere. On the other hand, bans are mandatory,
so you may want to go back to that and just specify it as "ne 1".

>  
>  
>  
> From: Users  On Behalf Of Miro Igov
> Sent: 10 April 2023 14:19
> To: users@clusterlabs.org
> Subject: [ClusterLabs] Location not working
>  
> Hello,
> I have a resource with location constraint set to:
>  
> location intranet-ip_on_any_nginx intranet-ip \
> rule -inf: opa-nginx_1_active eq 0 \
> rule -inf: opa-nginx_2_active eq 0
>  
> In syslog I see the attribute transition:
> Apr 10 12:11:02 intranet-test2 pacemaker-attrd[1511]:  notice:
> Setting opa-nginx_1_active[intranet-test1]: 1 -> 0
>  
> Current cluster status is :
>  
> Node List:
>   * Online: [ intranet-test1 intranet-test2 nas-sync-test1 nas-sync-
> test2 ]
>  
> * stonith-sbd (stonith:external/sbd):  Started intranet-test2
>   * admin-ip(ocf::heartbeat:IPaddr2):Started nas-sync-
> test2
>   * cron_symlink(ocf::heartbeat:symlink):Started
> intranet-test1
>   * intranet-ip (ocf::heartbeat:IPaddr2):Started intranet-
> test1
>   * mysql_1 (systemd:mariadb@intranet-test1):Started
> intranet-test1
>   * mysql_2 (systemd:mariadb@intranet-test2):Started
> intranet-test2
>   * nginx_1 (systemd:nginx@intranet-test1):  Stopped
>   * nginx_1_active  (ocf::pacemaker:attribute):  Stopped
>   * nginx_2 (systemd:nginx@intranet-test2):  Started intranet-
> test2
>   * nginx_2_active  (ocf::pacemaker:attribute):  Started
> intranet-test2
>   * php_1   (systemd:php5.6-fpm@intranet-test1): Started
> intranet-test1
>   * php_2   (systemd:php5.6-fpm@intranet-test2): Started
> intranet-test2
>   * data_1  (ocf::heartbeat:Filesystem): Stopped
>   * data_2  (ocf::heartbeat:Filesystem): Started intranet-
> test2
>   * nfs_export_1(ocf::heartbeat:exportfs):   Stopped
>   * nfs_export_2(ocf::heartbeat:exportfs):   Started nas-
> sync-test2
>   * nfs_server_1(systemd:nfs-server@nas-sync-test1):
> Stopped
>   * nfs_server_2(systemd:nfs-server@nas-sync-test2):
> Started nas-sync-test2
>  
> Failed Resource Actions:
>   * nfs_server_1_start_0 on nas-sync-test1 'error' (1): call=95,
> status='complete', exitreason='', last-rc-change='2023-04-10 12:35:12
> +02:00', queued=0ms, exec=209ms
>  
>  
> Why intranet-ip is located on intranet-test1 while nginx_1_active is
> 0 ?
>  
> # crm res constraint intranet-ip
>
> cron_symlink 
> (score=INFINITY, id=c_cron_symlink_on_intranet-ip)
> * intranet-ip
>   : Node nas-sync-
> test2 
> (score=-INFINITY, id=intranet-ip_loc-rule)
>   : Node nas-sync-
> test1 
> (score=-INFINITY, id=intranet-ip_loc-rule)
>  
> Why no constraint entry for intranet-ip_on_any_nginx location ?
>  
>  
>  
> 
>  This message has been sent as a part of discussion between PHARMYA
> and the addressee whose name is specified above. Should you receive
> this message by mistake, we would be most grateful if you informed us
> that the message has been sent to you. In this case, we also ask that
> you delete this message from your mailbox, and do not forward it or
> any part of it to anyone else.
> Thank you for your cooperation and understanding. 
> 
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Location not working

2023-04-10 Thread Ken Gaillot
On Mon, 2023-04-10 at 16:33 +0300, Andrei Borzenkov wrote:
> On Mon, Apr 10, 2023 at 4:26 PM Ken Gaillot 
> wrote:
> > On Mon, 2023-04-10 at 14:18 +0300, Miro Igov wrote:
> > > Hello,
> > > I have a resource with location constraint set to:
> > > 
> > > location intranet-ip_on_any_nginx intranet-ip \
> > > rule -inf: opa-nginx_1_active eq 0 \
> > > rule -inf: opa-nginx_2_active eq 0
> > 
> > You haven't specified a score for the constraint, so it defaults to
> > 0,
> > meaning the resource is allowed on those nodes but has no
> > preference
> > for them.
> > 
> 
> But each rule has score -INFINITY?

Whoops, I skimmed too quickly, you're right.

The referenced node attributes (opa-nginx_1_active and opa-
nginx_2_active) aren't set anywhere, so the condition isn't met.
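
As a quick check (a sketch using the standard Pacemaker CLI tools;
adjust attribute and node names to your setup), each transient
attribute can be queried per node:

  # prints the current value, or reports that the attribute is not set
  attrd_updater --query --name opa-nginx_1_active --node intranet-test1
  attrd_updater --query --name opa-nginx_2_active --node intranet-test2

If an attribute is unset on a node, the corresponding "eq 0" expression
is not true for that node, so the -INFINITY score is never applied
there.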

> 
>   <rsc_location id="intranet-ip_on_any_nginx" rsc="intranet-ip">
>     <rule score="-INFINITY" id="intranet-ip_on_any_nginx-rule">
>       <expression attribute="opa-nginx_1_active" operation="eq" value="0" id="intranet-ip_on_any_nginx-rule-expression"/>
>     </rule>
>     <rule score="-INFINITY" id="intranet-ip_on_any_nginx-rule-0">
>       <expression attribute="opa-nginx_2_active" operation="eq" value="0" id="intranet-ip_on_any_nginx-rule-0-expression"/>
>     </rule>
>   </rsc_location>
> 
> This exactly matches the example in documentation where score is
> moved
> from rsc_location to rule.
> 
> > > In syslog I see the attribute transition:
> > > Apr 10 12:11:02 intranet-test2 pacemaker-attrd[1511]:  notice:
> > > Setting opa-nginx_1_active[intranet-test1]: 1 -> 0
> > > 
> > > Current cluster status is :
> > > 
> > > Node List:
> > >   * Online: [ intranet-test1 intranet-test2 nas-sync-test1 nas-sync-test2 ]
> > > 
> > >   * stonith-sbd      (stonith:external/sbd):               Started intranet-test2
> > >   * admin-ip         (ocf::heartbeat:IPaddr2):             Started nas-sync-test2
> > >   * cron_symlink     (ocf::heartbeat:symlink):             Started intranet-test1
> > >   * intranet-ip      (ocf::heartbeat:IPaddr2):             Started intranet-test1
> > >   * mysql_1          (systemd:mariadb@intranet-test1):     Started intranet-test1
> > >   * mysql_2          (systemd:mariadb@intranet-test2):     Started intranet-test2
> > >   * nginx_1          (systemd:nginx@intranet-test1):       Stopped
> > >   * nginx_1_active   (ocf::pacemaker:attribute):           Stopped
> > >   * nginx_2          (systemd:nginx@intranet-test2):       Started intranet-test2
> > >   * nginx_2_active   (ocf::pacemaker:attribute):           Started intranet-test2
> > >   * php_1            (systemd:php5.6-fpm@intranet-test1):  Started intranet-test1
> > >   * php_2            (systemd:php5.6-fpm@intranet-test2):  Started intranet-test2
> > >   * data_1           (ocf::heartbeat:Filesystem):          Stopped
> > >   * data_2           (ocf::heartbeat:Filesystem):          Started intranet-test2
> > >   * nfs_export_1     (ocf::heartbeat:exportfs):            Stopped
> > >   * nfs_export_2     (ocf::heartbeat:exportfs):            Started nas-sync-test2
> > >   * nfs_server_1     (systemd:nfs-server@nas-sync-test1):  Stopped
> > >   * nfs_server_2     (systemd:nfs-server@nas-sync-test2):  Started nas-sync-test2
> > > 
> > > Failed Resource Actions:
> > >   * nfs_server_1_start_0 on nas-sync-test1 'error' (1): call=95,
> > > status='complete', exitreason='', last-rc-change='2023-04-10
> > > 12:35:12
> > > +02:00', queued=0ms, exec=209ms
> > > 
> > > 
> > > Why is intranet-ip located on intranet-test1 while nginx_1_active
> > > is 0?
> > > 
> > > # crm res constraint intranet-ip
> > > cron_symlink             (score=INFINITY, id=c_cron_symlink_on_intranet-ip)
> > > * intranet-ip
> > >   : Node nas-sync-test2  (score=-INFINITY, id=intranet-ip_loc-rule)
> > >   : Node nas-sync-test1  (score=-INFINITY, id=intranet-ip_loc-rule)
> > > 
> > > Why is there no constraint entry for the intranet-ip_on_any_nginx
> > > location?
> > > 
> > > 
> > >  This message has been sent as a part of discussion between
> > > PHARMYA
> > > and the addressee whose name is specified above. Should you
> > > receive
> > > this message by mistake, we would be most grateful if you
> > > informed us
> > > that the message has been sent to you. In this case, we also ask
> >

Re: [ClusterLabs] Location not working

2023-04-10 Thread Ken Gaillot
On Mon, 2023-04-10 at 14:18 +0300, Miro Igov wrote:
> Hello,
> I have a resource with a location constraint set to:
>  
> location intranet-ip_on_any_nginx intranet-ip \
> rule -inf: opa-nginx_1_active eq 0 \
> rule -inf: opa-nginx_2_active eq 0

You haven't specified a score for the constraint, so it defaults to 0,
meaning the resource is allowed on those nodes but has no preference
for them.
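
For illustration (a sketch in crm shell syntax; the first constraint
name below is made up), a score can be attached either to the
constraint itself or to each rule:

  # simple form: an explicit score on the constraint, tied to one node
  location intranet-ip_prefers_test1 intranet-ip 100: intranet-test1

  # rule form: each rule carries its own score
  location intranet-ip_on_any_nginx intranet-ip \
      rule -inf: opa-nginx_1_active eq 0 \
      rule -inf: opa-nginx_2_active eq 0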

>  
> In syslog I see the attribute transition:
> Apr 10 12:11:02 intranet-test2 pacemaker-attrd[1511]:  notice:
> Setting opa-nginx_1_active[intranet-test1]: 1 -> 0
>  
> Current cluster status is :
>  
> Node List:
>   * Online: [ intranet-test1 intranet-test2 nas-sync-test1 nas-sync-test2 ]
>  
>   * stonith-sbd      (stonith:external/sbd):               Started intranet-test2
>   * admin-ip         (ocf::heartbeat:IPaddr2):             Started nas-sync-test2
>   * cron_symlink     (ocf::heartbeat:symlink):             Started intranet-test1
>   * intranet-ip      (ocf::heartbeat:IPaddr2):             Started intranet-test1
>   * mysql_1          (systemd:mariadb@intranet-test1):     Started intranet-test1
>   * mysql_2          (systemd:mariadb@intranet-test2):     Started intranet-test2
>   * nginx_1          (systemd:nginx@intranet-test1):       Stopped
>   * nginx_1_active   (ocf::pacemaker:attribute):           Stopped
>   * nginx_2          (systemd:nginx@intranet-test2):       Started intranet-test2
>   * nginx_2_active   (ocf::pacemaker:attribute):           Started intranet-test2
>   * php_1            (systemd:php5.6-fpm@intranet-test1):  Started intranet-test1
>   * php_2            (systemd:php5.6-fpm@intranet-test2):  Started intranet-test2
>   * data_1           (ocf::heartbeat:Filesystem):          Stopped
>   * data_2           (ocf::heartbeat:Filesystem):          Started intranet-test2
>   * nfs_export_1     (ocf::heartbeat:exportfs):            Stopped
>   * nfs_export_2     (ocf::heartbeat:exportfs):            Started nas-sync-test2
>   * nfs_server_1     (systemd:nfs-server@nas-sync-test1):  Stopped
>   * nfs_server_2     (systemd:nfs-server@nas-sync-test2):  Started nas-sync-test2
>  
> Failed Resource Actions:
>   * nfs_server_1_start_0 on nas-sync-test1 'error' (1): call=95,
> status='complete', exitreason='', last-rc-change='2023-04-10 12:35:12
> +02:00', queued=0ms, exec=209ms
>  
>  
> Why is intranet-ip located on intranet-test1 while nginx_1_active is
> 0?
>  
> # crm res constraint intranet-ip
> cron_symlink             (score=INFINITY, id=c_cron_symlink_on_intranet-ip)
> * intranet-ip
>   : Node nas-sync-test2  (score=-INFINITY, id=intranet-ip_loc-rule)
>   : Node nas-sync-test1  (score=-INFINITY, id=intranet-ip_loc-rule)
>  
> Why is there no constraint entry for the intranet-ip_on_any_nginx location?
>  
> 
>  This message has been sent as a part of discussion between PHARMYA
> and the addressee whose name is specified above. Should you receive
> this message by mistake, we would be most grateful if you informed us
> that the message has been sent to you. In this case, we also ask that
> you delete this message from your mailbox, and do not forward it or
> any part of it to anyone else.
> Thank you for your cooperation and understanding. 
> 
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


  1   2   3   4   5   6   7   8   9   10   >