Re: [ClusterLabs] Antw: Antw: Re: Q: "crmd[7281]: warning: new_event_notification (7281-97955-15): Broken pipe (32)" as response to resource cleanup

2019-08-13 Thread Ken Gaillot
> > 13.08.2019 > > at > 10:07 in message <5d526fb002a100032...@gwsmtp.uni-regensburg.de > >: > > > > > Ken Gaillot wrote on 13.08.2019 at > > > > > 01:03 in > > > > message > > : > > > On Mon, 2019-08-12 at 17

Re: [ClusterLabs] Querying failed rersource operations from the CIB

2019-08-12 Thread Ken Gaillot
onder where to look for the reason) > BTW: "lrm_resources" is not documented, and the structure seems to > change. Can I restrict the output to LRM data? One possibility is to run crm_mon with --as-xml and parse the failed actions from that output. The schema is
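A minimal sketch of the crm_mon approach in Python. It assumes the XML output has a <failures> element whose <failure> children carry op_key, node and exitstatus attributes; those names and the sample document are illustrative assumptions, so check them against the schema shipped with your Pacemaker version.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of `crm_mon --as-xml` output. Element and attribute
# names are assumptions, not taken from the actual schema.
SAMPLE = """<crm_mon version="1.1.19">
  <failures>
    <failure op_key="prm_demo_start_0" node="node1" exitstatus="not installed"/>
    <failure op_key="prm_other_monitor_10000" node="node2" exitstatus="unknown error"/>
  </failures>
</crm_mon>"""


def failed_actions(xml_text):
    """Return one (op_key, node, exitstatus) tuple per <failure> element."""
    root = ET.fromstring(xml_text)
    return [
        (f.get("op_key"), f.get("node"), f.get("exitstatus"))
        for f in root.iter("failure")
    ]


for op_key, node, status in failed_actions(SAMPLE):
    print(f"{op_key} on {node}: {status}")
```

On a live cluster the XML text would come from running crm_mon with --as-xml (for example via subprocess.run with capture_output=True) instead of the sample string.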

Re: [ClusterLabs] why is node fenced ?

2019-08-12 Thread Ken Gaillot
5:47 crm node online ha-idg-2 > 17:26:35 crm node standby ha-idg1- > 17:30:21 zypper up (install Updates on ha-idg-1) > 17:37:32 systemctl reboot > 17:43:04 systemctl start pacemaker.service > 17:44:00 ha-idg-1 is fenced >

Re: [ClusterLabs] Q: "crmd[7281]: warning: new_event_notification (7281-97955-15): Broken pipe (32)" as response to resource cleanup

2019-08-12 Thread Ken Gaillot
resources are closely related, so changing the status of one member might affect what status the others report. > Regards, > Ulrich -- Ken Gaillot ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] Master/slave failover does not work as expected

2019-08-12 Thread Ken Gaillot
default but isn't for historical reasons. It's a good idea to always use --lifetime=reboot. You could double-check with "cibadmin -Q|grep master-" and see if there is more than one entry per node. -- Ken Gaillot
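The same double-check can be sketched in Python by counting master-* attributes per node in the CIB XML. The sample CIB below is invented for illustration, and the layout (permanent attributes under <nodes>, transient ones under <status>) is an assumption about how the entries are stored.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical CIB excerpt: node1 carries both a permanent and a transient
# "master-db" attribute, node2 only a transient one.
SAMPLE_CIB = """<cib>
  <configuration>
    <nodes>
      <node id="1" uname="node1">
        <instance_attributes id="nodes-1">
          <nvpair id="nodes-1-master-db" name="master-db" value="5"/>
        </instance_attributes>
      </node>
      <node id="2" uname="node2"/>
    </nodes>
  </configuration>
  <status>
    <node_state id="1" uname="node1">
      <transient_attributes id="1">
        <instance_attributes id="status-1">
          <nvpair id="status-1-master-db" name="master-db" value="10"/>
        </instance_attributes>
      </transient_attributes>
    </node_state>
    <node_state id="2" uname="node2">
      <transient_attributes id="2">
        <instance_attributes id="status-2">
          <nvpair id="status-2-master-db" name="master-db" value="5"/>
        </instance_attributes>
      </transient_attributes>
    </node_state>
  </status>
</cib>"""


def duplicate_master_scores(cib_xml):
    """Return sorted (node, attribute) pairs that appear more than once."""
    root = ET.fromstring(cib_xml)
    counts = Counter()
    # Look under both the permanent (<node>) and transient (<node_state>) sections.
    for holder in list(root.iter("node")) + list(root.iter("node_state")):
        for nvpair in holder.iter("nvpair"):
            name = nvpair.get("name", "")
            if name.startswith("master-"):
                counts[(holder.get("uname"), name)] += 1
    return sorted(key for key, seen in counts.items() if seen > 1)


print(duplicate_master_scores(SAMPLE_CIB))
```

On a real cluster the input would be the output of "cibadmin -Q" rather than the sample string.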

Re: [ClusterLabs] corosync.service (and sbd.service) are not stopper on pacemaker shutdown when corosync-qdevice is used

2019-08-09 Thread Ken Gaillot
: > > > > > > > > crm cluster start > > > > crm cluster stop > > > > > > > > Cheers, > > > > Roger > > > > > > > > > Unfortunately, corosync-qdevice.service declares > > > > > Requires=corosync.service and corosync-qdevice.service itself > > > > > is *not* > > > > > stopped when pacemaker.service is stopped. Which means > > > > > corosync.service > > > > > remains "needed" and is never stopped. > > > > > > > > > > Also sbd.service (which is PartOf=corosync.service) remains > > > > > running > > > > > as well. > > > > > > > > > > The latter is really bad, as it means sbd watchdog can kick > > > > > in at any > > > > > time when user believes cluster stack is safely stopped. In > > > > > particular > > > > > if qnetd is not accessible (think network reconfiguration). -- Ken Gaillot

Re: [ClusterLabs] Antw: how to connect to the cluster from a docker container

2019-08-07 Thread Ken Gaillot
new avenues for trouble, e.g. the user may be able to disable the hawk resource via hawk, but couldn't enable it again without resorting to the command line. Whatever approach you go with, in your case it's important to keep the pacemaker version inside the container the same or newer than the rest of the cluster. That's because it will need the schema files to validate the cluster configuration. (This isn't important for most containers, since they don't run any configuration commands.) > > Cheers, > > Dejan > > > Klaus > > > > > > > > Cheers, > > > > > > > > > > Dejan -- Ken Gaillot

Re: [ClusterLabs] how to connect to the cluster from a docker container

2019-08-06 Thread Ken Gaillot
know for sure I haven't ever tried this in practice, some > > one else here could have. Also, there may be a lot of fun with > > various > > Linux Security Modules like SELinux. -- Ken Gaillot

Re: [ClusterLabs] "resource cleanup" - but error message does not dissapear

2019-07-30 Thread Ken Gaillot
ed !?! > > Any idea ? > > > Bernd There was a regression in 1.1.20 and 2.0.0 (fixed in the next versions) where cleanups of multiple errors would miss some of them. Any chance you're using one of those? -- Ken Gaillot
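A small sketch for checking a version string against the two releases named above. The helper names are hypothetical, and only the leading X.Y.Z part of a package version is compared, which is an assumption about how distribution version strings are laid out.

```python
import re

# Releases named in the reply as containing the cleanup regression.
AFFECTED = {(1, 1, 20), (2, 0, 0)}


def parse_version(text):
    """Extract the leading X.Y.Z from a string such as '1.1.19+20181105.ccd6b5b10'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(version_text):
    """True if the version is one of the releases with the regression."""
    return parse_version(version_text) in AFFECTED


for candidate in ("1.1.19+20181105.ccd6b5b10", "1.1.20", "2.0.0", "2.0.2"):
    print(candidate, is_affected(candidate))
```

Distribution packages may carry backported fixes, so a match here is only a hint to look closer, not proof that the bug is present.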

Re: [ClusterLabs] Adding HAProxy as a Resource

2019-07-30 Thread Ken Gaillot
le your own, I'd go with the latest 1.1 or 2.0 version (currently 1.1.21 or 2.0.2). > > -Original Message- > From: Ken Gaillot > Sent: Tuesday, July 30, 2019 12:53 AM > To: Somanath Jeeva ; Cluster Labs - All > topics related to open-source clustering welcomed < > user

Re: [ClusterLabs] Strange behaviour of group resource

2019-07-30 Thread Ken Gaillot
> > Thanks & Regards > > Dileep Nair > Squad Lead - SAP Base > Togaf Certified Enterprise Architect > IBM Services for Managed Applications > +91 98450 22258 Mobile > dilen...@in.ibm.com > > IBM Services > > > Ken Gaillot ---07/30/2019 12:47:52 AM---

Re: [ClusterLabs] Adding HAProxy as a Resource

2019-07-29 Thread Ken Gaillot
> > > Somanath Thilak J

Re: [ClusterLabs] Strange behaviour of group resource

2019-07-29 Thread Ken Gaillot
IBM Services for Managed Applications > +91 98450 22258 Mobile > dilen...@in.ibm.com > > IBM Services -- Ken Gaillot

Re: [ClusterLabs] Reusing resource set in multiple constraints

2019-07-29 Thread Ken Gaillot
s://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Explained/index.html#_tagging_configuration_elements -- Ken Gaillot

Re: [ClusterLabs] Adding HAProxy as a Resource

2019-07-25 Thread Ken Gaillot

[ClusterLabs] Feedback wanted: Node reaction to fabric fencing

2019-07-24 Thread Ken Gaillot
(reboot) 2. Default to the current behavior (stop) 3. Default to the current behavior for now, and change it to the correct behavior whenever pacemaker 2.1 is released (probably a few years from now) -- Ken Gaillot

Re: [ClusterLabs] 2 nodes split brain with token timeout

2019-07-23 Thread Ken Gaillot
> totem.cluster_name (str) = FRPLZABPXY > totem.interface.0.bindnetaddr (str) = 10.XX.YY.2 > totem.interface.0.broadcast (str) = yes > totem.interface.0.mcastport (u16) = 5405 > totem.secauth (str) = off > totem.totem (str) = 4000 > totem.transport (str) = udpu > to

Re: [ClusterLabs] Which shell definitions to include?

2019-07-23 Thread Ken Gaillot
ges systemd requires, but systemd > complains about not being able to unmount /var while /var/run or > /var/lock is being used... Agreed, it should be a build-time option whether to use /run or /var/run -- Ken Gaillot

Re: [ClusterLabs] On the semantics of ocf_exit_reason()

2019-07-23 Thread Ken Gaillot
R_ARGS > elif [ "X${dest_tsap//[^-A-Za-z0-9._\\/]/}" != "X${dest_tsap}" ]; > then > ocf_log err "$me: invalid value $dest_tsap for \"dest\"" > result=$OCF_ERR_ARGS > fi > ... > > Regards, > Ulrich > > > __

Re: [ClusterLabs] Antw: Re: Antw: Re: Resource won't start, crm_resource -Y does not help

2019-07-23 Thread Ken Gaillot
On Tue, 2019-07-23 at 07:57 +0200, Ulrich Windl wrote: > > > > Ken Gaillot wrote on 22.07.2019 at > > > > 18:14 in message > > : > > On Mon, 2019-07-22 at 15:45 +0200, Ulrich Windl wrote: > > > Hi! > > > > > > My RA actually

Re: [ClusterLabs] [ClusterLabs Developers] pacemaker geo redundancy - 2 nodes

2019-07-23 Thread Ken Gaillot
> > > 2) Any disadvantages of going this way? > > > > > > > > > Thanks, > > > Rohit -- Ken Gaillot

Re: [ClusterLabs] Antw: Re: Resource won't start, crm_resource -Y does not help

2019-07-22 Thread Ken Gaillot
> status=complete, exitreason='', last-rc-change='Mon Jul 22 13:06:45 > > 2019', > > queued=0ms, exec=27ms > > > > > > Unfortunately the last resource cleanup was significantly _after_ > > > the time > > > > logged above, and it seems the CRM did not even re-try to start > > the RA. > > > > > > Could this be a bug in SLES12 SP4 > > > > (pacemaker-1.1.19+20181105.ccd6b5b10-3.10.1.x86_64)? > > > > > > Regards, > > > Ulrich -- Ken Gaillot

Re: [ClusterLabs] pacemaker alerts list

2019-07-17 Thread Ken Gaillot
alert types. The criticality would be derived from CRM_alert_desc, CRM_alert_rc, and CRM_alert_status. > > > > Thank you, > > Vlad > Equipment Management (EM) System Engineer -- Ken Gaillot

Re: [ClusterLabs] Antw: Interacting with Pacemaker from my code

2019-07-16 Thread Ken Gaillot
es. Your agent has to set a master score for each node, and the node with the highest master score will be chosen as master. Your agent uses the crm_master command to do this (see its man page for details). Good luck, and let us know how it goes. > > > > > > Please let me know if you need any clarification or any other > > information. > > > > > > Thanks in advance !!! -- Ken Gaillot
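The selection rule described here (highest master score wins) can be illustrated with a toy Python function. This is only a sketch of the idea, not Pacemaker's scheduler: the function name is hypothetical, ties are broken by node name purely to keep the result deterministic, and constraints or stickiness that a real cluster also weighs are ignored.

```python
def choose_master(master_scores):
    """Pick the node with the highest master score.

    master_scores maps node name -> score, as an agent might set via crm_master.
    Returns None when no node has a score. Ties go to the alphabetically first
    node, which is an assumption made for this illustration only.
    """
    if not master_scores:
        return None
    # max() keeps the first maximal element, so sorting the names first
    # makes the tie-break deterministic.
    return max(sorted(master_scores), key=lambda node: master_scores[node])


print(choose_master({"node1": 10, "node2": 100, "node3": 50}))
```

An agent would typically raise the score on the node holding the most current data and lower it elsewhere, so that this comparison favours the best promotion candidate.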

Re: [ClusterLabs] resource location preference vs utilization

2019-07-16 Thread Ken Gaillot
hat. Have you tried putting small negative location constraints for the non-picky resources on those nodes? Also, you could try putting a smaller stickiness on the non-picky resources (I don't know whether that will have any effect, just trying to think of ideas). -- Ken Gaillot

Re: [ClusterLabs] pacemaker geo redundancy - 2 nodes

2019-07-15 Thread Ken Gaillot
nces of split brain depend on your workload. Something like a database or cluster filesystem could become horribly corrupted. > > > Thanks, > Rohit -- Ken Gaillot

Re: [ClusterLabs] failed resource resurection - failcount/cleanup etc ?

2019-07-12 Thread Ken Gaillot
On Fri, 2019-07-12 at 13:33 +0100, lejeczek wrote: > On 11/07/2019 14:16, Ken Gaillot wrote: > > On Thu, 2019-07-11 at 10:39 +0100, lejeczek wrote: > > > On 10/07/2019 15:50, Ken Gaillot wrote: > > > > On Wed, 2019-07-10 at 11:26 +0100, lejeczek wrote: >

Re: [ClusterLabs] failed resource resurection - failcount/cleanup etc ?

2019-07-11 Thread Ken Gaillot
On Thu, 2019-07-11 at 10:39 +0100, lejeczek wrote: > On 10/07/2019 15:50, Ken Gaillot wrote: > > On Wed, 2019-07-10 at 11:26 +0100, lejeczek wrote: > > > hi guys, possibly @devel if they pop in here. > > > > > > is there, will there be, a way to make clus

Re: [ClusterLabs] failed resource resurection - failcount/cleanup etc ?

2019-07-10 Thread Ken Gaillot
the entire node failed, then fencing comes into play. -- Ken Gaillot

Re: [ClusterLabs] colocation - but do not stop resources on failure

2019-07-10 Thread Ken Gaillot
On Wed, 2019-07-10 at 10:30 +0100, lejeczek wrote: > On 09/07/2019 20:26, Ken Gaillot wrote: > > On Tue, 2019-07-09 at 11:21 +0100, lejeczek wrote: > > > hi guys, > > > > > > how to, if possible, create colocation which would not stop > > > depen

Re: [ClusterLabs] I have a question.

2019-07-09 Thread Ken Gaillot
ould make it more obvious that the group is for discussion by users and not just announcements. If there's any way I can help, let me know! Hopefully we can get more feedback from others. Does anyone know of other local user groups, or have opinions about names? -- Ken Gaillot _

Re: [ClusterLabs] "node is unclean" leads to gratuitous reboot

2019-07-09 Thread Ken Gaillot
.1.1 > mcastport: 5405 > ttl: 1 > } > } > > logging { > fileline: off > to_stderr: no > to_logfile: yes > logfile: /var/log/cluster/corosync.log > to_syslog: yes > debug: on > timestamp: on > logger_subsys { > subsys: QUORUM > debug: on > } > } > > nodelist { > node { > ring0_addr: mgraid-16201289RN00023-0 > nodeid: 1 > } > > node { > ring0_addr: mgraid-16201289RN00023-1 > nodeid: 2 > } > } > > quorum { > provider: corosync_votequorum > > two_node: 1 > > wait_for_all: 0 > } > > I’d appreciate any insight you can offer into this behavior, and any > suggestions you may have. > > Regards, > Michael > > > Michael Powell > Sr. Staff Engineer > > 15220 NW Greenbrier Pkwy > Suite 290 > Beaverton, OR 97006 > T 503-372-7327M 503-789-3019 H 503-625-5332 -- Ken Gaillot

Re: [ClusterLabs] colocation - but do not stop resources on failure

2019-07-09 Thread Ken Gaillot
finite score. Colocation is mandatory if the score is INFINITY (or -INFINITY for anti-colocation), otherwise it's a preference rather than a requirement. -- Ken Gaillot
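The distinction between mandatory and advisory colocation can be sketched in Python as follows. The function names are hypothetical, and the numeric value 1000000 for INFINITY reflects how Pacemaker represents it internally; finite scores are clamped to that range here as an assumption.

```python
INFINITY = 1_000_000  # Pacemaker's numeric stand-in for INFINITY


def parse_score(text):
    """Convert a score string such as '200', 'INFINITY' or '-INFINITY' to an int."""
    token = text.strip().upper()
    if token in ("INFINITY", "+INFINITY"):
        return INFINITY
    if token == "-INFINITY":
        return -INFINITY
    # Clamp finite values so nothing exceeds +/-INFINITY.
    return max(-INFINITY, min(INFINITY, int(token)))


def describe_colocation(score_text):
    """Classify a colocation score as mandatory, advisory, or neutral."""
    score = parse_score(score_text)
    if score >= INFINITY:
        return "mandatory colocation"
    if score <= -INFINITY:
        return "mandatory anti-colocation"
    if score > 0:
        return "preferred colocation"
    if score < 0:
        return "preferred anti-colocation"
    return "no preference"


for example in ("INFINITY", "-INFINITY", "500", "-10", "0"):
    print(example, "->", describe_colocation(example))
```

With a finite score, the dependent resource can keep running elsewhere when the other resource fails, which is the behaviour the original question asks for.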

Re: [ClusterLabs] two virtual domains start and stop every 15 minutes

2019-07-05 Thread Ken Gaillot
igration > > should not be considered dangling. > > > > A couple of side notes on your configuration: > > > > Instead of putting action=off in fence device configurations, you > > should use pcmk_reboot_action=off. Pacemaker adds action when > >

Re: [ClusterLabs] two virtual domains start and stop every 15 minutes

2019-07-03 Thread Ken Gaillot
: info: > update_cib_stonith_devices_v2:Updating device list from the cib: > delete lrm_resource[@id='vm_mouseidgenes'] > Jun 19 14:57:32 [9578] ha-idg-1 stonith-ng: info: > cib_devices_update: Updating devices to version 2.7007.891 > Jun 19 14:57:32 [9577] ha-idg-1cib: info: > cib_perform_op: Diff: --- 2.7

Re: [ClusterLabs] Problems with master/slave failovers

2019-07-02 Thread Ken Gaillot
ondary > > Clone Set: ms_king_resource [king_resource] (promotable) > > Masters: [ secondary ] > > Slaves: [ primary ] > > Clone Set: ms_servant1 [servant1] > > Started: [ primary secondary ] > > Clone Set: ms_servant2 [servant2] (promotable) > > Masters: [ secondary ] > > Slaves: [ primary ] > > Clone Set: ms_servant3 [servant3] (promotable) > > Masters: [ secondary ] > > Slaves: [ primary ] > > servant4(lsb:servant4): Started secondary > > servant5 (lsb:servant5):Started secondary > > servant6 (lsb:servant6):Started secondary > > servant7 (lsb:servant7): Started secondary > > servant8 (lsb:servant8):Started secondary > > Resource Group: servant9_active_disabled > > servant9_resource1 (lsb:servant9_resource1):Started > > secondary > > servant9_resource2 (lsb:servant9_resource2): Started > > secondary > > servant10 (lsb:servant10): Started secondary > > servant11 (lsb:servant11): Started secondary > > servant12(lsb:servant12): Started secondary > > servant13(lsb:servant13): Started secondary -- Ken Gaillot

Re: [ClusterLabs] Problems with master/slave failovers

2019-07-01 Thread Ken Gaillot
* Recoverking_resource:1 ( Master primary ) > > >> * Pseudo action: ms_king_resource_pre_notify_demote_0 > > >> * Resource action: king_resource notify on secondary > > >> * Resource action: king_re

Re: [ClusterLabs] Problems with master/slave failovers

2019-07-01 Thread Ken Gaillot
ry ] > servant4(lsb:servant4): Started secondary > servant5 (lsb:servant5):Started secondary > servant6 (lsb:servant6):Started secondary > servant7 (lsb:servant7): Started secondary > servant8 (lsb:servant8):Started secondary > Resourc

Re: [ClusterLabs] Problems with master/slave failovers

2019-06-28 Thread Ken Gaillot
otify on primary > > > * Pseudo action: ms_king_resource_confirmed-pre_notify_stop_0 > > > * Pseudo action: ms_king_resource_stop_0 > > > * Resource action: king_resource stop on primary > > > * Pseudo action: ms_king_resource_stopped_

Re: [ClusterLabs] iSCSI Target resource starts on both nodes -- despite my colocation constraint

2019-06-25 Thread Ken Gaillot
On Mon, 2019-06-24 at 14:45 -0500, Bryan K. Walton wrote: > On Mon, Jun 24, 2019 at 12:02:59PM -0500, Ken Gaillot wrote: > > > Jun 20 11:48:36 storage1 crmd[240695]: notice: Transition 1 > > > (Complete=12, Pending=0, Fired=0, Skipped=0, Incomplete=0, > > > Source

Re: [ClusterLabs] Two node cluster goes into split brain scenario during CPU intensive tasks

2019-06-25 Thread Ken Gaillot
acting to events until the end of the time. > > With Regards > Somanath Thilak J > > -Original Message- > From: Ken Gaillot > Sent: Monday, June 24, 2019 20:28 > To: Cluster Labs - All topics related to open-source clustering > welcomed ; Somanath Jeev

Re: [ClusterLabs] iSCSI Target resource starts on both nodes -- despite my colocation constraint

2019-06-24 Thread Ken Gaillot
online: > > Jun 20 11:48:36 storage2 crmd[22305]: notice: Result of probe > operation for ISCSIMillipedeIP on storage2: 7 (not running) > Jun 20 11:48:36 storage2 crmd[22305]: notice: Result of probe > operation for ISCSICentipedeIP on storage2: 7 (not running) > Jun 20 11:48:36 storage2 crmd[22305]: notice: Result of probe > operation for targetRHEVM on storage2: 0 (ok) > Jun 20 11:48:36 storage2 crmd[22305]: notice: Result of probe > operation for targetVMStorage on storage2: 0 (ok) > Jun 20 11:48:36 storage2 crmd[22305]: notice: Result of probe > operation for lunRHEVM on storage2: 7 (not running) > Jun 20 11:48:36 storage2 crmd[22305]: notice: Result of probe > operation for lunVMStorage on storage2: 7 (not running) > > > -- Ken Gaillot

Re: [ClusterLabs] Two node cluster goes into split brain scenario during CPU intensive tasks

2019-06-24 Thread Ken Gaillot
-- Ken Gaillot

Re: [ClusterLabs] crmsh: Release 4.1.0

2019-06-21 Thread Ken Gaillot
> > > Diego Akechi -- Ken Gaillot

Re: [ClusterLabs] two virtual domains start and stop every 15 minutes

2019-06-14 Thread Ken Gaillot
that time, if you need information > just let me know. Yes the logs and pe-input files would be helpful. It sounds like a bug in the scheduler. What version of pacemaker are you running? > > Thanks. > > > Bernd -- Ken Gaillot ___

Re: [ClusterLabs] FW: Fence agent definition under Centos7.6

2019-06-13 Thread Ken Gaillot
tch”. > > Specifically, I need to know where to put mgpstonith on the target > system(s). Generally, I’d appreciate a pointer to any > documentation/specification relevant to writing code for a fence > agent. > > Thanks, > Michael

[ClusterLabs] Minor regression in pacemaker 2.0.2

2019-06-12 Thread Ken Gaillot
(which is fixed with this PR, as a beneficial side effect). -- Ken Gaillot

Re: [ClusterLabs] cluster move resources back despite of stickiness ?

2019-06-07 Thread Ken Gaillot
ban/move commands creates permanent constraints that have to be cleared when you no longer want them. > > How do I ensure the cluster gets that the cost of behaving as above > is > too high. > > many thanks, L. -- Ken Gaillot

Re: [ClusterLabs] VirtualDomain and SELinux

2019-06-07 Thread Ken Gaillot
ged to manage SELinux in such a way where it all works? > > I'm on Centos 7.6 with selinux-policy-3.13.1-229.el7_6.12.noarch > > many thanks, L. Pacemaker processes run in the cluster_t context and so are subject to those policies. I'm not sure what all is available under that, but ma

Re: [ClusterLabs] Possible intrusive change in bundles for 2.0.3

2019-06-07 Thread Ken Gaillot
On Fri, 2019-06-07 at 16:19 +, Hayden,Robert wrote: > Thanks > Robert > > Robert Hayden | Sr. Technology Architect | Cerner Corporation | > > > -Original Message- > > From: Users On Behalf Of Ken > > Gaillot > > Sent: Thursday, June 6, 2019 5:

Re: [ClusterLabs] Possible intrusive change in bundles for 2.0.3

2019-06-07 Thread Ken Gaillot
On Fri, 2019-06-07 at 10:15 +0100, lejeczek wrote: > On 06/06/2019 23:34, Ken Gaillot wrote: > > Hi all, > > > > It has been discovered that newer versions of selinux-policy > > prevent > > bundles in pacemaker 2.0 from logging. I have a straightforward >

Re: [ClusterLabs] Maintenance & Pacemaker Restart Demotes MS Resources

2019-06-07 Thread Ken Gaillot
ve another node that > could become the DC while restarting pacemaker? If I do add another > node then the problem doesn't seem to appear. Yes, that makes sense. > > Dirk > > On Wed, Jun 5, 2019 at 3:17 PM Ken Gaillot > wrote: > > On Wed, 2019-06-05 at 13:28 -0700, Dirk

[ClusterLabs] Possible intrusive change in bundles for 2.0.3

2019-06-06 Thread Ken Gaillot
'm leaning to the in-code solution, but I want to ask if anyone thinks the bundle restarts on upgrade are a deal-breaker for a minor-minor release, and would prefer the packaged policy solution. -- Ken Gaillot

Re: [ClusterLabs] Pacemaker 2.0.2 final release now available

2019-06-06 Thread Ken Gaillot
On Thu, 2019-06-06 at 10:12 -0500, Ken Gaillot wrote: > While I appreciate brevity, this was my e-mail client eating a draft. :-/ Source code for the Pacemaker 2.0.2 and 1.1.21 releases is now available: https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.0.2 https://github.

[ClusterLabs] Pacemaker 2.0.2 final release now available

2019-06-06 Thread Ken Gaillot

Re: [ClusterLabs] Maintenance & Pacemaker Restart Demotes MS Resources

2019-06-05 Thread Ken Gaillot
pacemaker/pengine/pe- > input-78.bz2 > > So it looks like to me that the cluster is demoting ms_MariaDB from > Master to Slave. I'm not sure if I should have waited for something > else to occur? > > I have attached pe-input-76.bz2. > > Dirk > > On Wed, Jun

Re: [ClusterLabs] Maintenance & Pacemaker Restart Demotes MS Resources

2019-06-05 Thread Ken Gaillot
u could automate it with a fixed sleep or maybe a brief sleep plus crm_resource --wait. > Corosync 2.3.5-3ubuntu2.3 and Pacemaker 1.1.14-2ubuntu1.6 > > Sincerely, > Dirk > -- > Dirk Gassen > Senior Software Engineer | GetWellNetwork > o: 240.482.3146 > e: dgas...@getwellnetwork.com > To help people take an active role in their health journey -- Ken Gaillot

Re: [ClusterLabs] Fence agent definition under Centos7.6

2019-05-31 Thread Ken Gaillot
ny information on how to > create one “from scratch”. > > Specifically, I need to know where to put mgpstonith on the target > system(s). Generally, I’d appreciate a pointer to any > documentation/specification relevant to writing code for a fence > agent. > >

Re: [ClusterLabs] Pacemaker not reacting as I would expect when two resources fail at the same time

2019-05-31 Thread Ken Gaillot
d"/> You want first-action="promote" in the above constraint, otherwise the slave being started (or the master being started but not yet promoted) is sufficient to start snmp_active_disabled (even though the colocation ensures it will only be started on the same node where t

Re: [ClusterLabs] Resource-agents log is not output to /var/log/pacemaker/pacemaker.log on RHEL8

2019-05-30 Thread Ken Gaillot
On Wed, 2019-05-29 at 09:23 -0500, Ken Gaillot wrote: > On Wed, 2019-05-29 at 16:53 +0900, 飯田雄介 wrote: > > Hi Ken and Jan, > > > > Thank you for your comment. > > > > I understand that solution is to set PCMK_logfile in the sysconfig > > file. > >

[ClusterLabs] Pacemaker 2.0.2-rc3 now available

2019-05-30 Thread Ken Gaillot
and appreciated. -- Ken Gaillot

Re: [ClusterLabs] info: mcp_cpg_deliver: Ignoring process list sent by peer for local node

2019-05-29 Thread Ken Gaillot
617]: notice: Quorum > acquired > May 29 17:21:45 rider.private pacemakerd[51617]: notice: Node > whale.private state is now member > May 29 17:21:45 rider.private pacemakerd[51617]: notice: Node > swir.private state is now member > May 29 17:21:45 rider.private pacem

Re: [ClusterLabs] VirtualDomain and Resource_is_Too_Active ?? - problem/error

2019-05-29 Thread Ken Gaillot
peration stop failed 'not configured' (6) >error: unpack_rsc_op: Preventing HA-work9-win10-kvm from > re-starting anywhere: operation stop failed 'not configured' (6) >error: unpack_rsc_op:Preventing HA-work9-win10-kvm from > re-starting anywhere: operation stop failed 'not configured' (6) >error: unpack_rsc_op:Preventing HA-work9-win10-kvm from > re-starting anywhere: operation stop failed 'not configured' (6) >error: unpack_rsc_op:Preventing HA-work9-win10-kvm from > re-starting anywhere: operation stop failed 'not configured' (6) >error: native_create_actions:Resource HA-work9-win10-kvm is > active on 3 nodes (attempting recovery) > > Something buggy there, or I'm missing something obvious? > > many thanks, L. -- Ken Gaillot

Re: [ClusterLabs] Resource-agents log is not output to /var/log/pacemaker/pacemaker.log on RHEL8

2019-05-29 Thread Ken Gaillot
respective sysconfig/default/conf.d file > > for > > pacemaker) will trigger export of HA_LOGFILE environment variable > > propagated subsequently towards the agent processes, and everything > > then works as expected. IOW. OCF and/or resource-agents are still > > reasonably decoupled, thankfully. -- Ken Gaillot

Re: [ClusterLabs] Resource-agents log is not output to /var/log/pacemaker/pacemaker.log on RHEL8

2019-05-28 Thread Ken Gaillot
rce-agents library must look for PCMK_logfile as well as HA_logfile. In that case, the easiest solution will be for us to set PCMK_logfile explicitly in the shipped sysconfig file. I can squeeze that into the soon-to-be-released 2.0.2 since it's not a code change. > > Regards, > Yusuke -- Ken Gaillot

[ClusterLabs] FYI to anyone backporting the recent security fixes

2019-05-24 Thread Ken Gaillot
: improve CPG membership messages d0c12d98e01bc6228fc254456927d79a46554448 Fix: libcrmcommon: avoid use-of-NULL when checking whether process is active c0e1cf579f57922cbe872d23edf144dd2206156b Low: libcrmcommon: return proper code if testing pid is denied -- Ken Gaillot

Re: [ClusterLabs] drbd could not start by pacemaker. strange limited root privileges?

2019-05-23 Thread Ken Gaillot
/home/opc/tmp/modprobe2.trace > # ls -l /home/opc/tmp/modprobe2.trace > -rw-r--r--. 1 root root 0 May 21 15:44 /home/opc/tmp/modprobe2.trace > > > Thanks: lados. -- Ken Gaillot

[ClusterLabs] Pacemaker 2.0.2-rc2 now available

2019-05-21 Thread Ken Gaillot
cover all possible use cases, so your feedback is important and appreciated. A 1.1.21-rc1 with selected backports from the 2.0.2 release candidates will also be released soon. -- Ken Gaillot

Re: [ClusterLabs] Antw: Re: Constant stop/start of resource in spite of interval=0

2019-05-20 Thread Ken Gaillot
On Mon, 2019-05-20 at 23:15 +0200, Kadlecsik József wrote: > Hi, > > On Mon, 20 May 2019, Ken Gaillot wrote: > > > On Mon, 2019-05-20 at 15:29 +0200, Ulrich Windl wrote: > > > What worries me is "Rejecting name for unique". > > > > Trace me

Re: [ClusterLabs] Antw: Re: Constant stop/start of resource in spite of interval=0

2019-05-20 Thread Ken Gaillot
: Inverting > > > name match > > > > for private xml > > > trace May 18 23:02:49 build_parameter_list(632):0: Adding attr > > > > name=eduroam IPv4 tunnel to the xml result > > > trace May 18 23:02:49 build_parameter_list(621):0: Rejecting id > > > for > > > > private > > > trace May 18 23:02:49 build_parameter_list(625):0: Inverting id > > > match for > > private xml > > > trace May 18 23:02:49 build_parameter_list(632):0: Adding attr > > > id=0 to > > the > > xml result > > > > > > > > > By the way, it's debian stretch with pacemaker 1.1.16-1. > > > > I have double and triple checked the agent and it seems just a > > normal, > > working agent. > > > > The agent accepts the reload operation, it is advertised in the > > actions > > section of its metadata, there are parameters with unique set to 0 > > and > > still stop/start is called instead of reload. (I could even live > > with > > reload instead of start/stop in every 15 mins). > > > > As a desperate attempt, I deleted the resource and re-added and it > > of > > course did not help. > > > > I also created the attached trace file during creating the resource > > in the > > hope that it could help find the reason of the permanent > > stop/start. > > > > Best regards, > > Jozsef > > -- > > E-mail : kadlecsik.joz...@wigner.mta.hu > > PGP key: http://www.kfki.hu/~kadlec/pgp_public_key.txt > > Address: Wigner Research Centre for Physics, Hungarian Academy of > > Sciences > > H-1525 Budapest 114, POB. 49, Hungary -- Ken Gaillot

Re: [ClusterLabs] Fwd: Postgres pacemaker cluster failure

2019-05-16 Thread Ken Gaillot
On Thu, 2019-05-16 at 10:20 +0200, Jehan-Guillaume de Rorthais wrote: > On Wed, 15 May 2019 16:53:48 -0500 > Ken Gaillot wrote: > > > On Wed, 2019-05-15 at 11:50 +0200, Jehan-Guillaume de Rorthais > > wrote: > > > On Mon, 29 Apr 2019 19:59:49 +0300

Re: [ClusterLabs] Regarding Finalization Timer (I_ELECTION) just popped (1800000ms)

2019-05-15 Thread Ken Gaillot
crm_config: OK (rc=0, origin=local/crmd/93, version=0.102.0) > Oct 22 22:35:14 [76417] vm85c4465533 crmd: info: > plugin_handle_membership:Membership 52: quorum retained > Oct 22 22:35:14 [76417] vm85c4465533 crmd: info: > crmd_cs_dispatch:Setting expec

Re: [ClusterLabs] Fwd: Postgres pacemaker cluster failure

2019-05-15 Thread Ken Gaillot
On Wed, 2019-05-15 at 11:50 +0200, Jehan-Guillaume de Rorthais wrote: > On Mon, 29 Apr 2019 19:59:49 +0300 > Andrei Borzenkov wrote: > > > 29.04.2019 18:05, Ken Gaillot wrote: > > > > > > > > > Why does not it check OCF_RESKEY_CRM_meta_notif

Re: [ClusterLabs] Inconsistent clone $OCF_RESOURCE_INSTANCE value depending on symmetric-cluster property.

2019-04-29 Thread Ken Gaillot
e \ > last-lrm-refresh=1551115646 \ > have-watchdog=false > > And try to start m_Stateful again > > meta-data > OCF_RESOURCE_INSTANCE=Stateful_Test_1 > start > OCF_RESOURCE_INSTANCE=p_Stateful > promote > O

Re: [ClusterLabs] Fwd: Postgres pacemaker cluster failure

2019-04-29 Thread Ken Gaillot
tive on more than one node, returning the default > value for > clone-max > Attribute 'clone-max' not found for 'pgsql-ha' > Error performing operation: No such device or address > > > crm_resource --resource pgsqld --meta --get-parameter=cl

Re: [ClusterLabs] Pacemaker detail log directory permissions

2019-04-26 Thread Ken Gaillot
On Thu, 2019-04-25 at 18:49 +0200, Jan Pokorný wrote: > On 24/04/19 09:32 -0500, Ken Gaillot wrote: > > On Wed, 2019-04-24 at 16:08 +0200, wf...@niif.hu wrote: > > > Make install creates /var/log/pacemaker with mode 0770, owned by > > > hacluster:haclient. However

[ClusterLabs] Pacemaker 2.0.2-rc1 now available

2019-04-24 Thread Ken Gaillot
-Guillaume de Rorthais, Ken Gaillot, Klaus Wenninger, and Maciej Sobkowiak. A 1.1.21-rc1 with selected backports from this release will also be released soon. -- Ken Gaillot

[ClusterLabs] Coming in Pacemaker 2.0.2: changes of interest to packagers

2019-04-24 Thread Ken Gaillot
APIs are affected, the change does not break public API backward compatibility. (The new library will eventually contain some high-level public APIs.) -- Ken Gaillot

Re: [ClusterLabs] Pacemaker detail log directory permissions

2019-04-24 Thread Ken Gaillot
ite the log is not an additional concern. With ACLs, I could see wanting to change the permissions, and that idea has come up already. One approach might be to add a PCMK_log_mode option that would default to 0660, and users could make it more strict if desired. -- Ken Gaillot

Re: [ClusterLabs] Failover event not reported correctly?

2019-04-18 Thread Ken Gaillot
dea is that failures might occur when you're not looking :) and you can see that they happened the next time you check the status, even if the cluster was able to recover successfully. To clear the history, run "crm_resource -C -r MyCluster" (or "pcs resource cleanup MyCluster" if you're using pcs). -- Ken Gaillot
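The cleanup step mentioned in this reply could be wrapped roughly as follows. This is only a sketch: the function name `clear_failures` is made up for illustration, and "MyCluster" is simply the resource name used in the message. It prefers pcs when it is installed and otherwise falls back to crm_resource, using only the two commands quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pacemaker or pcs): clear the failure
# history of one resource with whichever tool is available.
clear_failures() {
    resource="$1"
    if command -v pcs >/dev/null 2>&1; then
        # pcs front end, as quoted in the reply
        pcs resource cleanup "$resource"
    elif command -v crm_resource >/dev/null 2>&1; then
        # low-level Pacemaker tool, as quoted in the reply
        crm_resource -C -r "$resource"
    else
        echo "neither pcs nor crm_resource found" >&2
        return 127
    fi
}
```

It would be called as `clear_failures MyCluster` on a cluster node, typically as root.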

Re: [ClusterLabs] shutdown of 2-Node cluster when power outage

2019-04-18 Thread Ken Gaillot
m your distribution uses (e.g. systemctl) should be sufficient. -- Ken Gaillot

Re: [ClusterLabs] Question about fencing

2019-04-17 Thread Ken Gaillot

[ClusterLabs] Pacemaker security issues discovered and patched

2019-04-17 Thread Ken Gaillot
(unlikely to be of interest to most users) is that the hacluster user and haclient group must exist before running the executor and fencer regression tests. -- Ken Gaillot

[ClusterLabs] Coming in 2.0.2: check whether a date-based rule is expired

2019-04-16 Thread Ken Gaillot
cluster or resource property. -- Ken Gaillot

Re: [ClusterLabs] Resource not starting correctly

2019-04-15 Thread Ken Gaillot
if [ $? -eq 0 ]; then > return $OCF_SUCCESS > fi > > return $OCF_ERR_GENERIC > ;; > > *) > return $state > ;; > esac > } > > I know for a fact that, in one, myapp_launch gets invoked, and that > its ex

Re: [ClusterLabs] failure-timeout

2019-04-11 Thread Ken Gaillot
ut is a resource meta-attribute. Look at pcs's man page to see how to use "pcs resource meta" to set those. -- Ken Gaillot
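As a rough illustration of the "pcs resource meta" usage this reply points to, the sketch below sets failure-timeout on a single resource. The function name `set_failure_timeout`, the resource name `my_resource`, and the value `60s` are assumptions chosen for the example; the pcs man page remains the authority on syntax.

```shell
#!/bin/sh
# Hypothetical helper (not part of pcs): set the failure-timeout
# meta-attribute on one resource via "pcs resource meta".
set_failure_timeout() {
    resource="$1"
    timeout="$2"
    if [ -z "$resource" ] || [ -z "$timeout" ]; then
        echo "usage: set_failure_timeout RESOURCE TIMEOUT" >&2
        return 2
    fi
    # Meta-attributes are passed to pcs as name=value pairs.
    pcs resource meta "$resource" "failure-timeout=$timeout"
}
```

For example, `set_failure_timeout my_resource 60s` would run `pcs resource meta my_resource failure-timeout=60s`.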

Re: [ClusterLabs] The service restart, when other node joins the cluster

2019-04-08 Thread Ken Gaillot
>IP > |->The service restart, Why? > > Thanks & Regards > > Leonardo Assunção Hi, Does the output of crm_mon show any errors? Or the logs? -- Ken Gaillot

Re: [ClusterLabs] Unable to restart resources

2019-03-26 Thread Ken Gaillot
and also > from pmk2, several times, to no avail. > > How can I bring back the pmk1 node on correctly, so that everything > is how it originally was - i.e. with pmk1 up and running, and with > the resources also up and running in pmk1?

Re: [ClusterLabs] Colocation constraint moving resource

2019-03-26 Thread Ken Gaillot
On Tue, 2019-03-26 at 22:12 +0300, Andrei Borzenkov wrote: > 26.03.2019 17:14, Ken Gaillot wrote: > > On Tue, 2019-03-26 at 14:11 +0100, Thomas Singleton wrote: > > > Dear all > > > > > > I am encountering an issue with colocation constraints. > > >

Re: [ClusterLabs] Antw: Re: Apache graceful restart not supported by heartbeat apache control script

2019-03-26 Thread Ken Gaillot
figuration of httpd is outside the scope of > the clustering software. > > I will proceed with my workarounds that I have found so far. > > Thank you all again so much for your quick help. > > Best regards, > Cole > > > On Mar 26, 2019, at 12:02 AM, Ulrich Windl &

Re: [ClusterLabs] Colocation constraint moving resource

2019-03-26 Thread Ken Gaillot
TestResourceNode1 allocation score on nodespare: 53 > native_color: TestResourceNode2 allocation score on node2: 50 > native_color: TestResourceNode2 allocation score on nodespare: > -INFINITY > native_color: TestResourceNode3 allocation score on node3: 10 > native_color: TestResourceNode3 allocation score on nodespare: 3 This seems like a bug to me. Can you attach (or e-mail me privately) the pe-input file that led to the above situation? -- Ken Gaillot

Re: [ClusterLabs] Apache graceful restart not supported by heartbeat apache control script

2019-03-25 Thread Ken Gaillot
tions with no downtime? > > I am new to this list and could not find a way to search the > archives, so if this question has already been answered, could you > point me to the search area and to the answer as well? > > Thank you in advance for your advice and recommendatio

Re: [ClusterLabs] Interface confusion

2019-03-19 Thread Ken Gaillot
qdevice: corosync is not enabled > Sending updated corosync.conf to nodes... > srv2cr1: Succeeded > srv1cr1: Succeeded > Corosync configuration reloaded > Starting corosync-qdevice... > srv2cr1: corosync-qdevice started > srv1cr1: corosync-qdevice started > > > >

Re: [ClusterLabs] Interface confusion

2019-03-19 Thread Ken Gaillot
node's point of view, the other node is lost. So, each will attempt to fence the other. A delay on one node in this situation makes it less likely that they both pull the trigger at the same time, ending up with both nodes dead. > Thank you > > Mon, 18.03.2019, 16:19, user Ken Gaillo

Re: [ClusterLabs] logging for pacemaker_remote?

2019-03-18 Thread Ken Gaillot
aker are explained or is the only > documentation for that in the file itself? Just the file itself for now. I plan on eventually adding it to the Pacemaker Explained document, but it would just be the same info. > > Thanks again, > Mike > From: Users on behalf of Ken Gaillot > >

Re: [ClusterLabs] logging for pacemaker_remote?

2019-03-18 Thread Ken Gaillot
sked. > > Thank you for any help, > Mike -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org https://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.o

Re: [ClusterLabs] Interface confusion

2019-03-18 Thread Ken Gaillot
be replaced by node names as seen by pacemaker. > > Once you > > >>>>> set up > > >>>>> and start your cluster, run 'pcs status' to get (among > > other info) > > >>>> the > > >>>>> node names. In

Re: [ClusterLabs] FYI: clusterlabs.org server maintenance

2019-03-13 Thread Ken Gaillot
On Fri, 2019-03-08 at 15:24 -0600, Ken Gaillot wrote: > Hi all, > > The clusterlabs.org websites and mailing lists have up till now been > hosted on Andrew Beekhof's linode server, which he is now getting rid > of. Fabio DiNitto has graciously volunteered a VM for ClusterLab

Re: [ClusterLabs] Resource creation information

2019-03-12 Thread Ken Gaillot
the resource agent. :-) "pcs resource providers" will list the ones available on the local system. Pacemaker provides a few of its own, under the "pacemaker" provider; the resource-agents package provides a bunch under the "heartbeat" provider (a legacy name from the day

Re: [ClusterLabs] systemd dependencies

2019-03-11 Thread Ken Gaillot
ervice is in a state such > as the above. > I can manually restart pacemaker services and it's fine. > > many thanks, L. That's odd. corosync is the only required dependency. There are optional dependencies on dbus, syslog/rsyslog, and the usually empty resource-agents-deps target. -
