Re: [ClusterLabs] Error in documentation of resource sets (collocation)?

2019-01-25 Thread Ken Gaillot
and should be "A is colocated > with B" You're right. I think that section would benefit from some rewording as well; the examples should be designed such that resource A is always placed first, which will make them easier to compare. I'll tr

[ClusterLabs] Pacemaker 2.0.1-rc4 now available

2019-01-30 Thread Ken Gaillot
regression tests and simulations, but we can't cover all possible use cases, so your feedback is important and appreciated. I will also release 1.1.20-rc2 with selected backports from this release soon. -- Ken Gaillot ___ Users mailing list: Users

Re: [ClusterLabs] need help in "Scratch Step-by-Step Instructions for Building Your First High-Availability " wget -O - http://localhost/server-status

2019-01-31 Thread Ken Gaillot
atus for localhost (via ::1) > > > How can i add the --no-check-certificate or any other configs to the > "pcs resource create WebSite ocf:heartbeat:apache > configfile=/etc/httpd/conf/httpd.conf statusurl=" > https://localhost/server-status; op monitor interval=1min Use htt

Re: [ClusterLabs] Resource not starting correctly

2019-04-15 Thread Ken Gaillot
if [ $? -eq 0 ]; then > return $OCF_SUCCESS > fi > > return $OCF_ERR_GENERIC > ;; > > *) > return $state > ;; > esac > } > > I know for a fact that, in one, myapp_launch gets invoked, and that > its ex

[ClusterLabs] Coming in 2.0.2: check whether a date-based rule is expired

2019-04-16 Thread Ken Gaillot
cluster or resource property. -- Ken Gaillot ___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] Question about fencing

2019-04-17 Thread Ken Gaillot
> > > > > > > > > ___ > > > Manage your subscription: > > > https://lists.clusterlabs.org/mailman/listinfo/users > > > > > > ClusterLabs home: https://www.clusterlabs.org/ >

Re: [ClusterLabs] shutdown of 2-Node cluster when power outage

2019-04-18 Thread Ken Gaillot
m your distribution uses (e.g. systemctl) should be sufficient. -- Ken Gaillot

Re: [ClusterLabs] Fwd: Postgres pacemaker cluster failure

2019-05-15 Thread Ken Gaillot
On Wed, 2019-05-15 at 11:50 +0200, Jehan-Guillaume de Rorthais wrote: > On Mon, 29 Apr 2019 19:59:49 +0300 > Andrei Borzenkov wrote: > > > 29.04.2019 18:05, Ken Gaillot wrote: > > > > > > > > > Why does not it check OCF_RESKEY_CRM_meta_notif

Re: [ClusterLabs] Regarding Finalization Timer (I_ELECTION) just popped (1800000ms)

2019-05-15 Thread Ken Gaillot
crm_config: OK (rc=0, origin=local/crmd/93, version=0.102.0) > Oct 22 22:35:14 [76417] vm85c4465533 crmd: info: > plugin_handle_membership:Membership 52: quorum retained > Oct 22 22:35:14 [76417] vm85c4465533 crmd: info: > crmd_cs_dispatch:Setting expec

Re: [ClusterLabs] Pacemaker not reacting as I would expect when two resources fail at the same time

2019-05-31 Thread Ken Gaillot
d"/> You want first-action="promote" in the above constraint, otherwise the slave being started (or the master being started but not yet promoted) is sufficient to start snmp_active_disabled (even though the colocation ensures it will only be started on the same node where t

Re: [ClusterLabs] Fence agent definition under Centos7.6

2019-05-31 Thread Ken Gaillot
ny information on how to > create one “from scratch”. > > Specifically, I need to know where to put mgpstonith on the target > system(s). Generally, I’d appreciate a pointer to any > documentation/specification relevant to writing code for a fence > agent. > >

Re: [ClusterLabs] Resource-agents log is not output to /var/log/pacemaker/pacemaker.log on RHEL8

2019-05-28 Thread Ken Gaillot
rce-agents library must look for PCMK_logfile as well as HA_logfile. In that case, the easiest solution will be for us to set PCMK_logfile explicitly in the shipped sysconfig file. I can squeeze that into the soon-to-be-released 2.0.2 since it's not a code change. > > Regards, > Yusuke -- Ken Gaillot

Re: [ClusterLabs] Maintenance & Pacemaker Restart Demotes MS Resources

2019-06-05 Thread Ken Gaillot
u could automate it with a fixed sleep or maybe a brief sleep plus crm_resource --wait. > Corosync 2.3.5-3ubuntu2.3 and Pacemaker 1.1.14-2ubuntu1.6 > > Sincerely, > Dirk > -- > Dirk Gassen > Senior Software Engineer | GetWellNetwork > o: 240.482.3146 > e: dgas...@getwellnetwork.com > To help people take an active role in their health journey -- Ken Gaillot
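The "brief sleep plus crm_resource --wait" idea from the reply above could be scripted roughly as follows. This is only a sketch: the function name, the 5-second default, and the injectable `run` and `sleep` parameters are assumptions made for illustration; only the `crm_resource --wait` command itself comes from the message.

```python
import subprocess
import time


def settle_cluster(pause_seconds=5, run=subprocess.run, sleep=time.sleep):
    """Pause briefly, then block until crm_resource --wait returns.

    pause_seconds is an arbitrary illustrative value, not a recommendation.
    run and sleep are injectable (hypothetical parameters) so the function
    can be exercised without a live cluster.
    """
    # Give the cluster a moment to register the change before waiting on it.
    sleep(pause_seconds)
    # Hand the actual waiting over to crm_resource.
    result = run(["crm_resource", "--wait"], check=False)
    return result.returncode == 0
```

A caller would invoke `settle_cluster()` between the step that triggers cluster activity and the next step that depends on the cluster having settled.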

Re: [ClusterLabs] VirtualDomain and SELinux

2019-06-07 Thread Ken Gaillot
ged to manage SELinux in such a way where it all works? > > I'm on Centos 7.6 with selinux-policy-3.13.1-229.el7_6.12.noarch > > many thanks, L. Pacemaker processes run in the cluster_t context and so are subject to those policies. I'm not sure what all is available under that, but ma

Re: [ClusterLabs] Possible intrusive change in bundles for 2.0.3

2019-06-07 Thread Ken Gaillot
On Fri, 2019-06-07 at 16:19 +, Hayden,Robert wrote: > Thanks > Robert > > Robert Hayden | Sr. Technology Architect | Cerner Corporation | > > > -Original Message- > > From: Users On Behalf Of Ken > > Gaillot > > Sent: Thursday, June 6, 2019 5:

Re: [ClusterLabs] Maintenance & Pacemaker Restart Demotes MS Resources

2019-06-07 Thread Ken Gaillot
ve another node that > could become the DC while restarting pacemaker? If I do add another > node then the problem doesn't seem to appear. Yes, that makes sense. > > Dirk > > On Wed, Jun 5, 2019 at 3:17 PM Ken Gaillot > wrote: > > On Wed, 2019-06-05 at 13:28 -0700, Dirk

Re: [ClusterLabs] cluster move resources back despite of stickiness ?

2019-06-07 Thread Ken Gaillot
ban/move commands creates permanent constraints that have to be cleared when you no longer want them. > > How do I ensure the cluster gets that the cost of behaving as above > is > too high. > > many thanks, L. -- Ken Gaillot ___ Man

Re: [ClusterLabs] Possible intrusive change in bundles for 2.0.3

2019-06-07 Thread Ken Gaillot
On Fri, 2019-06-07 at 10:15 +0100, lejeczek wrote: > On 06/06/2019 23:34, Ken Gaillot wrote: > > Hi all, > > > > It has been discovered that newer versions of selinux-policy > > prevent > > bundles in pacemaker 2.0 from logging. I have a straightforward > &

[ClusterLabs] Possible intrusive change in bundles for 2.0.3

2019-06-06 Thread Ken Gaillot
'm leaning to the in-code solution, but I want to ask if anyone thinks the bundle restarts on upgrade are a deal-breaker for a minor-minor release, and would prefer the packaged policy solution. -- Ken Gaillot

Re: [ClusterLabs] VirtualDomain and Resource_is_Too_Active ?? - problem/error

2019-05-29 Thread Ken Gaillot
peration stop failed 'not configured' (6) >error: unpack_rsc_op: Preventing HA-work9-win10-kvm from > re-starting anywhere: operation stop failed 'not configured' (6) >error: unpack_rsc_op:Preventing HA-work9-win10-kvm from > re-starting anywhere: operation stop failed 'not configured' (6) >error: unpack_rsc_op:Preventing HA-work9-win10-kvm from > re-starting anywhere: operation stop failed 'not configured' (6) >error: unpack_rsc_op:Preventing HA-work9-win10-kvm from > re-starting anywhere: operation stop failed 'not configured' (6) >error: native_create_actions:Resource HA-work9-win10-kvm is > active on 3 nodes (attempting recovery) > > Something buggy there, or I'm missing something obvious? > > many thanks, L. -- Ken Gaillot

Re: [ClusterLabs] Resource-agents log is not output to /var/log/pacemaker/pacemaker.log on RHEL8

2019-05-29 Thread Ken Gaillot
respective sysconfig/default/conf.d file > > for > > pacemaker) will trigger export of HA_LOGFILE environment variable > > propagated subsequently towards the agent processes, and everything > > then works as expected. IOW. OCF and/or resource-agents are still > > reasonably decoupled, thankfully. -- Ken Gaillot

Re: [ClusterLabs] Maintenance & Pacemaker Restart Demotes MS Resources

2019-06-05 Thread Ken Gaillot
pacemaker/pengine/pe- > input-78.bz2 > > So it looks like to me that the cluster is demoting ms_MariaDB from > Master to Slave. I'm not sure if I should have waited for something > else to occur? > > I have attached pe-input-76.bz2. > > Dirk > > On Wed, Jun

[ClusterLabs] Pacemaker 2.0.2 final release now available

2019-06-06 Thread Ken Gaillot
___ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] Pacemaker 2.0.2 final release now available

2019-06-06 Thread Ken Gaillot
On Thu, 2019-06-06 at 10:12 -0500, Ken Gaillot wrote: > While I appreciate brevity, this was my e-mail client eating a draft. :-/ Source code for the Pacemaker 2.0.2 and 1.1.21 releases is now available: https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.0.2 https://github.

[ClusterLabs] Pacemaker 2.0.2-rc3 now available

2019-05-30 Thread Ken Gaillot
and appreciated. -- Ken Gaillot

Re: [ClusterLabs] Resource-agents log is not output to /var/log/pacemaker/pacemaker.log on RHEL8

2019-05-30 Thread Ken Gaillot
On Wed, 2019-05-29 at 09:23 -0500, Ken Gaillot wrote: > On Wed, 2019-05-29 at 16:53 +0900, 飯田雄介 wrote: > > Hi Ken and Jan, > > > > Thank you for your comment. > > > > I understand that solution is to set PCMK_logfile in the sysconfig > > file. > > &

Re: [ClusterLabs] info: mcp_cpg_deliver: Ignoring process list sent by peer for local node

2019-05-29 Thread Ken Gaillot
617]: notice: Quorum > acquired > May 29 17:21:45 rider.private pacemakerd[51617]: notice: Node > whale.private state is now member > May 29 17:21:45 rider.private pacemakerd[51617]: notice: Node > swir.private state is now member > May 29 17:21:45 rider.private pacem

Re: [ClusterLabs] two virtual domains start and stop every 15 minutes

2019-06-14 Thread Ken Gaillot
that time, if you need information > just let me know. Yes the logs and pe-input files would be helpful. It sounds like a bug in the scheduler. What version of pacemaker are you running? > > Thanks. > > > Bernd -- Ken Gaillot ___

Re: [ClusterLabs] Two node cluster goes into split brain scenario during CPU intensive tasks

2019-06-25 Thread Ken Gaillot
acting to events until the end of the time. > > With Regards > Somanath Thilak J > > -Original Message- > From: Ken Gaillot > Sent: Monday, June 24, 2019 20:28 > To: Cluster Labs - All topics related to open-source clustering > welcomed ; Somanath Jeev

Re: [ClusterLabs] iSCSI Target resource starts on both nodes -- despite my colocation constraint

2019-06-25 Thread Ken Gaillot
On Mon, 2019-06-24 at 14:45 -0500, Bryan K. Walton wrote: > On Mon, Jun 24, 2019 at 12:02:59PM -0500, Ken Gaillot wrote: > > > Jun 20 11:48:36 storage1 crmd[240695]: notice: Transition 1 > > > (Complete=12, Pending=0, Fired=0, Skipped=0, Incomplete=0, > > > Source

Re: [ClusterLabs] FW: Fence agent definition under Centos7.6

2019-06-13 Thread Ken Gaillot
tch”. > > Specifically, I need to know where to put mgpstonith on the target > system(s). Generally, I’d appreciate a pointer to any > documentation/specification relevant to writing code for a fence > agent. > > Thanks, > Michael > ___ > Manage you

Re: [ClusterLabs] crmsh: Release 4.1.0

2019-06-21 Thread Ken Gaillot
> > > Diego Akechi -- Ken Gaillot

Re: [ClusterLabs] Two node cluster goes into split brain scenario during CPU intensive tasks

2019-06-24 Thread Ken Gaillot
-- Ken Gaillot

Re: [ClusterLabs] iSCSI Target resource starts on both nodes -- despite my colocation constraint

2019-06-24 Thread Ken Gaillot
online: > > Jun 20 11:48:36 storage2 crmd[22305]: notice: Result of probe > operation for ISCSIMillipedeIP on storage2: 7 (not running) > Jun 20 11:48:36 storage2 crmd[22305]: notice: Result of probe > operation for ISCSICentipedeIP on storage2: 7 (not running) > Jun 20 11:48:36 storage2 crmd[22305]: notice: Result of probe > operation for targetRHEVM on storage2: 0 (ok) > Jun 20 11:48:36 storage2 crmd[22305]: notice: Result of probe > operation for targetVMStorage on storage2: 0 (ok) > Jun 20 11:48:36 storage2 crmd[22305]: notice: Result of probe > operation for lunRHEVM on storage2: 7 (not running) > Jun 20 11:48:36 storage2 crmd[22305]: notice: Result of probe > operation for lunVMStorage on storage2: 7 (not running) > > > -- Ken Gaillot

[ClusterLabs] Minor regression in pacemaker 2.0.2

2019-06-12 Thread Ken Gaillot
; (which is fixed with this PR, as a beneficial side effect). -- Ken Gaillot

Re: [ClusterLabs] drbd could not start by pacemaker. strange limited root privileges?

2019-05-23 Thread Ken Gaillot
/home/opc/tmp/modprobe2.trace > # ls -l /home/opc/tmp/modprobe2.trace > -rw-r--r--. 1 root root 0 May 21 15:44 /home/opc/tmp/modprobe2.trace > > > Thanks: lados. -- Ken Gaillot

[ClusterLabs] FYI to anyone backporting the recent security fixes

2019-05-24 Thread Ken Gaillot
: improve CPG membership messages d0c12d98e01bc6228fc254456927d79a46554448 Fix: libcrmcommon: avoid use-of-NULL when checking whether process is active c0e1cf579f57922cbe872d23edf144dd2206156b Low: libcrmcommon: return proper code if testing pid is denied -- Ken Gaillot

Re: [ClusterLabs] Antw: Re: Constant stop/start of resource in spite of interval=0

2019-05-20 Thread Ken Gaillot
: Inverting > > > name match > > > > for private xml > > > trace May 18 23:02:49 build_parameter_list(632):0: Adding attr > > > > name=eduroam IPv4 tunnel to the xml result > > > trace May 18 23:02:49 build_parameter_list(621):0: Rejecting id > > > for > > > > private > > > trace May 18 23:02:49 build_parameter_list(625):0: Inverting id > > > match for > > private xml > > > trace May 18 23:02:49 build_parameter_list(632):0: Adding attr > > > id=0 to > > the > > xml result > > > > > > > > > By the way, it's debian stretch with pacemaker 1.1.16-1. > > > > I have double and triple checked the agent and it seems just a > > normal, > > working agent. > > > > The agent accepts the reload operation, it is advertised in the > > actions > > section of its metadata, there are parameters with unique set to 0 > > and > > still stop/start is called instead of reload. (I could even live > > with > > reload instead of start/stop in every 15 mins). > > > > As a desperate attempt, I deleted the resource and re-added and it > > of > > course did not help. > > > > I also created the attached trace file during creating the resource > > in the > > hope that it could help find the reason of the permanent > > stop/start. > > > > Best regards, > > Jozsef > > -- > > E-mail : kadlecsik.joz...@wigner.mta.hu > > PGP key: http://www.kfki.hu/~kadlec/pgp_public_key.txt > > Address: Wigner Research Centre for Physics, Hungarian Academy of > > Sciences > > H-1525 Budapest 114, POB. 49, Hungary -- Ken Gaillot

[ClusterLabs] Pacemaker 2.0.2-rc2 now available

2019-05-21 Thread Ken Gaillot
cover all possible use cases, so your feedback is important and appreciated. A 1.1.21-rc1 with selected backports from the 2.0.2 release candidates will also be released soon. -- Ken Gaillot

Re: [ClusterLabs] Fwd: Postgres pacemaker cluster failure

2019-05-16 Thread Ken Gaillot
On Thu, 2019-05-16 at 10:20 +0200, Jehan-Guillaume de Rorthais wrote: > On Wed, 15 May 2019 16:53:48 -0500 > Ken Gaillot wrote: > > > On Wed, 2019-05-15 at 11:50 +0200, Jehan-Guillaume de Rorthais > > wrote: > > > On Mon, 29 Apr 2019 19:59:49 +0300

Re: [ClusterLabs] Antw: Re: Constant stop/start of resource in spite of interval=0

2019-05-20 Thread Ken Gaillot
On Mon, 2019-05-20 at 23:15 +0200, Kadlecsik József wrote: > Hi, > > On Mon, 20 May 2019, Ken Gaillot wrote: > > > On Mon, 2019-05-20 at 15:29 +0200, Ulrich Windl wrote: > > > What worries me is "Rejecting name for unique". > > > > Trace me

Re: [ClusterLabs] Fwd: Postgres pacemaker cluster failure

2019-04-29 Thread Ken Gaillot
tive on more than one node, returning the default > value for > clone-max > Attribute 'clone-max' not found for 'pgsql-ha' > Error performing operation: No such device or address > > > crm_resource --resource pgsqld --meta --get-parameter=cl

Re: [ClusterLabs] Inconsistent clone $OCF_RESOURCE_INSTANCE value depending on symmetric-cluster property.

2019-04-29 Thread Ken Gaillot
e \ > last-lrm-refresh=1551115646 \ > have-watchdog=false > > And try to start m_Stateful again > > meta-data > OCF_RESOURCE_INSTANCE=Stateful_Test_1 > start > OCF_RESOURCE_INSTANCE=p_Stateful > promote > O

Re: [ClusterLabs] Failover event not reported correctly?

2019-04-18 Thread Ken Gaillot
dea is that failures might occur when you're not looking :) and you can see that they happened the next time you check the status, even if the cluster was able to recover successfully. To clear the history, run "crm_resource -C -r MyCluster" (or "pcs resource cleanup MyCluster" if you're using pcs). -- Ken Gaillot

Re: [ClusterLabs] Pacemaker detail log directory permissions

2019-04-26 Thread Ken Gaillot
On Thu, 2019-04-25 at 18:49 +0200, Jan Pokorný wrote: > On 24/04/19 09:32 -0500, Ken Gaillot wrote: > > On Wed, 2019-04-24 at 16:08 +0200, wf...@niif.hu wrote: > > > Make install creates /var/log/pacemaker with mode 0770, owned by > > > hacluster:haclient. However

Re: [ClusterLabs] two virtual domains start and stop every 15 minutes

2019-07-05 Thread Ken Gaillot
igration > > should not be considered dangling. > > > > A couple of side notes on your configuration: > > > > Instead of putting action=off in fence device configurations, you > > should use pcmk_reboot_action=off. Pacemaker adds action when > >

Re: [ClusterLabs] colocation - but do not stop resources on failure

2019-07-10 Thread Ken Gaillot
On Wed, 2019-07-10 at 10:30 +0100, lejeczek wrote: > On 09/07/2019 20:26, Ken Gaillot wrote: > > On Tue, 2019-07-09 at 11:21 +0100, lejeczek wrote: > > > hi guys, > > > > > > how to, if possible, create colocation which would not stop > > > depen

Re: [ClusterLabs] failed resource resurection - failcount/cleanup etc ?

2019-07-10 Thread Ken Gaillot
the entire node failed, then fencing comes into play. -- Ken Gaillot

Re: [ClusterLabs] failed resource resurection - failcount/cleanup etc ?

2019-07-11 Thread Ken Gaillot
On Thu, 2019-07-11 at 10:39 +0100, lejeczek wrote: > On 10/07/2019 15:50, Ken Gaillot wrote: > > On Wed, 2019-07-10 at 11:26 +0100, lejeczek wrote: > > > hi guys, possibly @devel if they pop in here. > > > > > > is there, will there be, a way to make clus

Re: [ClusterLabs] failed resource resurection - failcount/cleanup etc ?

2019-07-12 Thread Ken Gaillot
On Fri, 2019-07-12 at 13:33 +0100, lejeczek wrote: > On 11/07/2019 14:16, Ken Gaillot wrote: > > On Thu, 2019-07-11 at 10:39 +0100, lejeczek wrote: > > > On 10/07/2019 15:50, Ken Gaillot wrote: > > > > On Wed, 2019-07-10 at 11:26 +0100, lejeczek wrote: >

Re: [ClusterLabs] Problems with master/slave failovers

2019-07-02 Thread Ken Gaillot
ondary > > Clone Set: ms_king_resource [king_resource] (promotable) > > Masters: [ secondary ] > > Slaves: [ primary ] > > Clone Set: ms_servant1 [servant1] > > Started: [ primary secondary ] > > Clone Set: ms_servant2 [servant2] (promotable) > > Masters: [ secondary ] > > Slaves: [ primary ] > > Clone Set: ms_servant3 [servant3] (promotable) > > Masters: [ secondary ] > > Slaves: [ primary ] > > servant4(lsb:servant4): Started secondary > > servant5 (lsb:servant5):Started secondary > > servant6 (lsb:servant6):Started secondary > > servant7 (lsb:servant7): Started secondary > > servant8 (lsb:servant8):Started secondary > > Resource Group: servant9_active_disabled > > servant9_resource1 (lsb:servant9_resource1):Started > > secondary > > servant9_resource2 (lsb:servant9_resource2): Started > > secondary > > servant10 (lsb:servant10): Started secondary > > servant11 (lsb:servant11): Started secondary > > servant12(lsb:servant12): Started secondary > > servant13(lsb:servant13): Started secondary -- Ken Gaillot

Re: [ClusterLabs] two virtual domains start and stop every 15 minutes

2019-07-03 Thread Ken Gaillot
: info: > update_cib_stonith_devices_v2:Updating device list from the cib: > delete lrm_resource[@id='vm_mouseidgenes'] > Jun 19 14:57:32 [9578] ha-idg-1 stonith-ng: info: > cib_devices_update: Updating devices to version 2.7007.891 > Jun 19 14:57:32 [9577] ha-idg-1cib: info: > cib_perform_op: Diff: --- 2.7

Re: [ClusterLabs] Problems with master/slave failovers

2019-07-01 Thread Ken Gaillot
ry ] > servant4(lsb:servant4): Started secondary > servant5 (lsb:servant5):Started secondary > servant6 (lsb:servant6):Started secondary > servant7 (lsb:servant7): Started secondary > servant8 (lsb:servant8):Started secondary > Resourc

Re: [ClusterLabs] Problems with master/slave failovers

2019-07-01 Thread Ken Gaillot
* Recoverking_resource:1 ( Master primary ) > > >> * Pseudo action: ms_king_resource_pre_notify_demote_0 > > >> * Resource action: king_resource notify on secondary > > >> * Resource action: king_re

Re: [ClusterLabs] Pacemaker detail log directory permissions

2019-04-24 Thread Ken Gaillot
ite the log is not an additional concern. With ACLs, I could see wanting to change the permissions, and that idea has come up already. One approach might be to add a PCMK_log_mode option that would default to 0660, and users could make it more strict if desired. -- Ken Gaillot ___

[ClusterLabs] Coming in Pacemaker 2.0.2: changes of interest to packagers

2019-04-24 Thread Ken Gaillot
APIs are affected, the change does not break public API backward compatibility. (The new library will eventually contain some high-level public APIs.) -- Ken Gaillot

[ClusterLabs] Pacemaker 2.0.2-rc1 now available

2019-04-24 Thread Ken Gaillot
-Guillaume de Rorthais, Ken Gaillot, Klaus Wenninger, and Maciej Sobkowiak. A 1.1.21-rc1 with selected backports from this release will also be released soon. -- Ken Gaillot

Re: [ClusterLabs] failure-timeout

2019-04-11 Thread Ken Gaillot
ut is a resource meta-attribute. Look at pcs's man page to see how to use "pcs resource meta" to set those. -- Ken Gaillot

[ClusterLabs] Pacemaker security issues discovered and patched

2019-04-17 Thread Ken Gaillot
(unlikely to be of interest to most users) is that the hacluster user and haclient group must exist before running the executor and fencer regression tests. -- Ken Gaillot

Re: [ClusterLabs] Antw: Interacting with Pacemaker from my code

2019-07-16 Thread Ken Gaillot
es. Your agent has to set a master score for each node, and the node with the highest master score will be chosen as master. Your agent uses the crm_master command to do this (see its man page for details). Good luck, and let us know how it goes. > > > > > > Please let me know if you need any clarification or any other > > information. > > > > > > Thanks in advance !!! -- Ken Gaillot
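The selection rule described in the reply above (the node with the highest master score is chosen) can be illustrated with a few lines of Python. This is a toy model under stated assumptions, not Pacemaker's scheduler: the function name, the dictionary input, treating a missing score as "not eligible", and the alphabetical tie-break are all choices made for this sketch.

```python
def pick_master(master_scores):
    """Return the node with the highest master score, or None.

    master_scores maps node name -> score, standing in for the values an
    agent would set with crm_master. A score of None means the agent has
    set no score for that node (treated as ineligible in this sketch).
    """
    eligible = {node: score for node, score in master_scores.items()
                if score is not None}
    if not eligible:
        return None
    # Highest score first; ties fall back to node name so the result is
    # deterministic here (an assumption of the sketch only).
    return min(eligible, key=lambda node: (-eligible[node], node))
```

For example, `pick_master({"node1": 10, "node2": 100})` selects `node2`, which mirrors why an agent that wants a particular node promoted gives it the larger score.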

Re: [ClusterLabs] resource location preference vs utilization

2019-07-16 Thread Ken Gaillot
hat. Have you tried putting small negative location constraints for the non-picky resources on those nodes? Also, you could try putting a smaller stickiness on the non-picky resources (I don't know whether that will have any effect, just trying to think of ideas). -- Ken Gaillot

Re: [ClusterLabs] Antw: Antw: Re: Q: "crmd[7281]: warning: new_event_notification (7281-97955-15): Broken pipe (32)" as response to resource cleanup

2019-08-13 Thread Ken Gaillot
> > 13.08.2019 > > at > 10:07 in message <5d526fb002a100032...@gwsmtp.uni-regensburg.de > >: > > > > > Ken Gaillot wrote on 13.08.2019 at > > > > > 01:03 in > > > > message > : > > > On Mon, 2019‑08‑12 at 17

Re: [ClusterLabs] cloned ethmonitor - upon failure of all nodes

2019-08-15 Thread Ken Gaillot
y negative preference to a node where a single attribute is 0 (one rule for each attribute). -- Ken Gaillot

Re: [ClusterLabs] 2-node DRBD Pacemaker not performing as expected: Where to next?

2019-08-15 Thread Ken Gaillot
get better integrations > with RHEL 7, CentOS 7, an edge release of Ubuntu, or some other > distribution? There are advantages and disadvantages to changing either of the above, but I doubt any choice will be easier, just a different set of roadblocks to work through. > Thank y

Re: [ClusterLabs] Querying failed rersource operations from the CIB

2019-08-12 Thread Ken Gaillot
onder where to look for the reason) > BTW: "lrm_resources" is not documented, and the structure seems to > change. Can I restrict the output to LRM data? One possibility is to run crm_mon with --as-xml and parse the failed actions from that output. The schema is
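Parsing failed actions out of `crm_mon --as-xml` output might look something like the sketch below. The element name `failure` and the attribute names in the sample are assumptions based on typical crm_mon XML output and should be checked against the schema shipped with your Pacemaker version; the sample document is hand-written for illustration, not captured from a real cluster.

```python
import xml.etree.ElementTree as ET

# Hand-written, hypothetical sample; real output would come from running
# "crm_mon --as-xml" and reading its standard output.
SAMPLE = """<crm_mon>
  <failures>
    <failure op_key="rsc1_monitor_10000" node="node1"
             exitstatus="not running" exitcode="7" status="complete"/>
  </failures>
</crm_mon>"""


def failed_actions(xml_text):
    """Return one dict of attributes per <failure> element found."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, so the exact nesting does not matter.
    return [dict(element.attrib) for element in root.iter("failure")]


if __name__ == "__main__":
    for action in failed_actions(SAMPLE):
        print(action.get("op_key"), "failed on", action.get("node"))
```

Using `dict(element.attrib)` keeps every attribute, so the caller does not depend on a fixed attribute list if the output changes between versions.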

Re: [ClusterLabs] Q: "crmd[7281]: warning: new_event_notification (7281-97955-15): Broken pipe (32)" as response to resource cleanup

2019-08-12 Thread Ken Gaillot
resources are closely related, so changing the status of one member might affect what status the others report. > Regards, > Ulrich -- Ken Gaillot

Re: [ClusterLabs] why is node fenced ?

2019-08-12 Thread Ken Gaillot
5:47 crm node online ha-idg-2 > 17:26:35 crm node standby ha-idg1- > 17:30:21 zypper up (install Updates on ha-idg-1) > 17:37:32 systemctl reboot > 17:43:04 systemctl start pacemaker.service > 17:44:00 ha-idg-1 is fenced >

Re: [ClusterLabs] Master/slave failover does not work as expected

2019-08-12 Thread Ken Gaillot
default but isn't for historical reasons. It's a good idea to always use --lifetime=reboot. You could double-check with "cibadmin -Q|grep master-" and see if there is more than one entry per node. -- Ken Gaillot

Re: [ClusterLabs] why is node fenced ?

2019-08-14 Thread Ken Gaillot
dheit und Umwelt (GmbH) > Ingolstaedter Landstr. 1 > 85764 Neuherberg > www.helmholtz-muenchen.de > Aufsichtsratsvorsitzende: MinDir'in Prof. Dr. Veronika von Messling > Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, > Heinrich Bassler, Kerstin Guenther > R

Re: [ClusterLabs] Q: "Re-initiated expired calculated failure"

2019-08-14 Thread Ken Gaillot
0:0;68:6583:0:d941efc1-de73-4ee4-b593- > > > f65be9e90726" > > > exit-reason="" on_node="h11" call-id="800" rc-code="0" op- > > > status="0" > > > i

Re: [ClusterLabs] node name issues (Could not obtain a node name for corosync nodeid 739512332)

2019-08-22 Thread Ken Gaillot
deid 739512331 > crmd: notice: get_node_name: Could not obtain a node name for > corosync nodeid 739512331 > crmd: info: pcmk_cpg_membership: Node 739512331 still member > of group crmd (peer=(null):7281, counter=0.0, at least once) > crmd: info: crm_update_peer_p

Re: [ClusterLabs] Q: "pengine[7280]: error: Characters left over after parsing '10#012': '#012'"

2019-08-22 Thread Ken Gaillot
g one of pe-*-series-max cluster options misconfigured.) > > > The message was written after an announced resource move to the new > > node. -- Ken Gaillot

Re: [ClusterLabs] New status reporting for starting/stopping resources in 1.1.19-8.el7

2019-09-03 Thread Ken Gaillot
wonder whether this will be the default status reporting > going forward (I will try the 2.0 branch soon). It indeed was changed in the 2.0.0 release. RHEL 7 backported the change from there. > > Thanks, > Chris -- Ken Gaillot

Re: [ClusterLabs] Q: Recommened directory for RA auxillary files?

2019-09-03 Thread Ken Gaillot
to follow the FHS, you might consider /usr/share if you're installing via custom packages, /usr/local/share if you're just installing locally, or /srv in either case. -- Ken Gaillot

Re: [ClusterLabs] Antw: Re: Q: The effect of using "default" attribute in RA metadata

2019-09-05 Thread Ken Gaillot
rLabs/OCF-spec The ra/next directory is where we're putting proposed changes (ra-api.rng is the RNG). Once accepted for the upcoming 1.1 standard, the changes are copied to the ra/1.1 directory, and at some point, 1.1 will be officially adopted as the current standard. So, pull request

Re: [ClusterLabs] Antw: Re: Q: Recommened directory for RA auxillary files?

2019-09-05 Thread Ken Gaillot
On Thu, 2019-09-05 at 07:57 +0200, Ulrich Windl wrote: > > > > Ken Gaillot schrieb am 04.09.2019 um > > > > 16:26 in > > Nachricht > <2634f19382b90736bdfb80b9c84997111479d337.ca...@redhat.com>: > > On Wed, 2019‑09‑04 at 10:07 +0200, Jehan‑Guillaum

Re: [ClusterLabs] Antw: Re: Q: Recommened directory for RA auxillary files?

2019-09-04 Thread Ken Gaillot
On Wed, 2019-09-04 at 10:09 +0200, Jehan-Guillaume de Rorthais wrote: > On Wed, 04 Sep 2019 07:54:50 +0200 > "Ulrich Windl" wrote: > > > > > > Ken Gaillot schrieb am 03.09.2019 um > > > > > 16:35 in > > > > Nachricht &

Re: [ClusterLabs] Q: Recommened directory for RA auxillary files?

2019-09-04 Thread Ken Gaillot
On Wed, 2019-09-04 at 10:07 +0200, Jehan-Guillaume de Rorthais wrote: > On Tue, 03 Sep 2019 09:35:39 -0500 > Ken Gaillot wrote: > > > On Mon, 2019-09-02 at 15:23 +0200, Ulrich Windl wrote: > > > Hi! > > > > > > Are there any recommendations where to pla

Re: [ClusterLabs] stonith-ng - performing action 'monitor' timed out with signal 15

2019-09-16 Thread Ken Gaillot
tion=status\nPARAMETER=VALUE\nPARAMETER=VALUE\n..." | /path/to/agent where PARAMETER=VALUE entries are what you have configured in the cluster. If the problem isn't obvious from that, you can try adding a debug_file parameter. -- Ken Gaillot
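The manual test described in this message feeds newline-separated settings to the fence agent on standard input. A minimal sketch of building that input follows; the helper name build_agent_input and the example parameters (ip, login) are hypothetical, and the real PARAMETER=VALUE pairs are whatever is configured for the fence device in the cluster.

```shell
#!/bin/sh
# Sketch only: print the stdin payload for a manual fence agent test.
# The first argument is the action; every further argument is one
# PARAMETER=VALUE pair, emitted on its own line.
build_agent_input() {
    action=$1
    shift
    printf 'action=%s\n' "$action"
    for pair in "$@"; do
        printf '%s\n' "$pair"
    done
}

# Usage (agent path and parameters are placeholders, not real values):
#   build_agent_input status ip=192.0.2.10 login=admin | /path/to/agent
```

Using printf rather than echo -e avoids differences between shells in how escape sequences are interpreted.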

Re: [ClusterLabs] Why is last-lrm-refresh part of the CIB config?

2019-09-09 Thread Ken Gaillot
ster status) are triggered by changes in the CIB. last-lrm-refresh isn't really special in any way, it's just a value that can be changed arbitrarily to trigger a new transition when nothing "real" is changing. I'm not sure what would actually be setting it these days; its use has

Re: [ClusterLabs] Corosync main process was not scheduled for 2889.8477 ms (threshold is 800.0000 ms), though it runs with realtime priority and there was not much load on the node

2019-09-09 Thread Ken Gaillot
c will "pet" from its main loop. From corosync.conf(5): > > > In a cluster with properly configured power fencing a watchdog > > provides no additional value. On the other hand, slow watchdog > > communication may incur multi-s

Re: [ClusterLabs] op stop timeout update causes monitor op to fail?

2019-09-11 Thread Ken Gaillot
> I attached the log of the transition. > > Regards, > Dennis -- Ken Gaillot

Re: [ClusterLabs] Strange behaviour of group resource

2019-07-30 Thread Ken Gaillot
> > Thanks & Regards > > Dileep Nair > Squad Lead - SAP Base > Togaf Certified Enterprise Architect > IBM Services for Managed Applications > +91 98450 22258 Mobile > dilen...@in.ibm.com > > IBM Services > > > Ken Gaillot ---07/30/2019 12:47:52 AM---

Re: [ClusterLabs] Adding HAProxy as a Resource

2019-07-30 Thread Ken Gaillot
le your own, I'd go with the latest 1.1 or 2.0 version (currently 1.1.21 or 2.0.2). > > -Original Message- > From: Ken Gaillot > Sent: Tuesday, July 30, 2019 12:53 AM > To: Somanath Jeeva ; Cluster Labs - All > topics related to open-source clustering welcomed < > user

Re: [ClusterLabs] "resource cleanup" - but error message does not dissapear

2019-07-30 Thread Ken Gaillot
ed !?! > > Any idea ? > > > Bernd There was a regression in 1.1.20 and 2.0.0 (fixed in the next versions) where cleanups of multiple errors would miss some of them. Any chance you're using one of those? -- Ken Gaillot

Re: [ClusterLabs] Antw: how to connect to the cluster from a docker container

2019-08-07 Thread Ken Gaillot
new avenues for trouble, e.g. the user may be able to disable the hawk resource via hawk, but couldn't enable it again without resorting to the command line. Whatever approach you go with, in your case it's important to keep the pacemaker version inside the container the same or newer than the rest of the cluster. That's because it will need the schema files to validate the cluster configuration. (This isn't important for most containers, since they don't run any configuration commands.) > > Cheers, > > Dejan > > > Klaus > > > > > > > > Cheers, > > > > > > > > > > Dejan -- Ken Gaillot

Re: [ClusterLabs] Reusing resource set in multiple constraints

2019-07-29 Thread Ken Gaillot
s://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Explained/index.html#_tagging_configuration_elements -- Ken Gaillot

Re: [ClusterLabs] Strange behaviour of group resource

2019-07-29 Thread Ken Gaillot
IBM Services for Managed Applications > +91 98450 22258 Mobile > dilen...@in.ibm.com > > IBM Services -- Ken Gaillot

Re: [ClusterLabs] Adding HAProxy as a Resource

2019-07-29 Thread Ken Gaillot
> > > Somanath Thilak J > > > > > > > > > > > > _______________ > > > > Manage your subscription: > > > > > > https://protect2.fireeye.com/url?k=28466b53-74926310-28462bc8-86a1150 > > &

Re: [ClusterLabs] how to connect to the cluster from a docker container

2019-08-06 Thread Ken Gaillot
know for sure I haven't ever tried this in practice, some > > one else here could have. Also, there may be a lot of fun with > > various > > Linux Security Modules like SELinux. -- Ken Gaillot

Re: [ClusterLabs] corosync.service (and sbd.service) are not stopper on pacemaker shutdown when corosync-qdevice is used

2019-08-09 Thread Ken Gaillot
: > > > > > > > > crm cluster start > > > > crm cluster stop > > > > > > > > Cheers, > > > > Roger > > > > > > > > > Unfortunately, corosync-qdevice.service declares > > > > > Requires=corosync.service and corosync-qdevice.service itself > > > > > is *not* > > > > > stopped when pacemaker.service is stopped. Which means > > > > > corosync.service > > > > > remains "needed" and is never stopped. > > > > > > > > > > Also sbd.service (which is PartOf=corosync.service) remains > > > > > running > > > > > as well. > > > > > > > > > > The latter is really bad, as it means sbd watchdog can kick > > > > > in at any > > > > > time when user believes cluster stack is safely stopped. In > > > > > particular > > > > > if qnetd is not accessible (think network reconfiguration). -- Ken Gaillot

Re: [ClusterLabs] pacemaker alerts list

2019-07-17 Thread Ken Gaillot
alert types. The criticality would be derived from CRM_alert_desc, CRM_alert_rc, and CRM_alert_status. > > > > Thank you, > > Vlad > Equipment Management (EM) System Engineer -- Ken Gaillot

Re: [ClusterLabs] Which shell definitions to include?

2019-07-23 Thread Ken Gaillot
ges systemd requires, but systemd > complains about not being able to unmount /var while /var/run or > /var/lock is being used... Agreed, it should be a build-time option whether to use /run or /var/run -- Ken Gaillot

Re: [ClusterLabs] Antw: Re: Antw: Re: Resource won't start, crm_resource -Y does not help

2019-07-23 Thread Ken Gaillot
On Tue, 2019-07-23 at 07:57 +0200, Ulrich Windl wrote: > > > > Ken Gaillot schrieb am 22.07.2019 um > > > > 18:14 in Nachricht > > : > > On Mon, 2019-07-22 at 15:45 +0200, Ulrich Windl wrote: > > > Hi! > > > > > > My RA actually

Re: [ClusterLabs] [ClusterLabs Developers] pacemaker geo redundancy - 2 nodes

2019-07-23 Thread Ken Gaillot
> > > 2) Any disadvantages of going this way? > > > > > > > > > Thanks, > > > Rohit -- Ken Gaillot

Re: [ClusterLabs] On the semantics of ocf_exit_reason()

2019-07-23 Thread Ken Gaillot
R_ARGS > elif [ "X${dest_tsap//[^-A-Za-z0-9._\\/]/}" != "X${dest_tsap}" ]; > then > ocf_log err "$me: invalid value $dest_tsap for \"dest\"" > result=$OCF_ERR_ARGS > fi > ... > > Regards, > Ulrich > > > __
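The quoted agent code rejects a parameter value when stripping every non-whitelisted character changes it. A portable sketch of the same idea follows, swapping in a POSIX case pattern for the bash-only ${var//pattern/} expansion. The function name is_valid_value is hypothetical, and the whitelist here (letters, digits, '-', '.', '_', '/') is a simplification that leaves out the backslash allowed by the quoted code.

```shell
#!/bin/sh
# Sketch only: succeed (return 0) when the value contains nothing but
# whitelisted characters, fail (return 1) otherwise.
# The leading '!' negates the bracket expression; the '-' placed right
# after it is matched literally.
is_valid_value() {
    case $1 in
        *[!-A-Za-z0-9._/]*) return 1 ;;  # found a character outside the whitelist
        *) return 0 ;;
    esac
}

# Usage: an agent could log an error and set its result to $OCF_ERR_ARGS
# when is_valid_value fails for a configured parameter.
```

Unlike the quoted check, this form does not build a second copy of the string, and it also works in shells without pattern substitution.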

Re: [ClusterLabs] 2 nodes split brain with token timeout

2019-07-23 Thread Ken Gaillot
> totem.cluster_name (str) = FRPLZABPXY > totem.interface.0.bindnetaddr (str) = 10.XX.YY.2 > totem.interface.0.broadcast (str) = yes > totem.interface.0.mcastport (u16) = 5405 > totem.secauth (str) = off > totem.totem (str) = 4000 > totem.transport (str) = udpu > to

Re: [ClusterLabs] pacemaker geo redundancy - 2 nodes

2019-07-15 Thread Ken Gaillot
nces of split brain depend on your workload. Something like a database or cluster filesystem could become horribly corrupted. > > > Thanks, > Rohit -- Ken Gaillot

[ClusterLabs] Feedback wanted: Node reaction to fabric fencing

2019-07-24 Thread Ken Gaillot
(reboot) 2. Default to the current behavior (stop) 3. Default to the current behavior for now, and change it to the correct behavior whenever pacemaker 2.1 is released (probably a few years from now) -- Ken Gaillot

Re: [ClusterLabs] Adding HAProxy as a Resource

2019-07-25 Thread Ken Gaillot
bc8-86a1150 > > > b > > > c3ba- > > > bb674f3a9b557cbd=1=https%3A%2F%2Flists.clusterlabs.org%2Fmai > > > l > > > man%2Flistinfo%2Fusers > > > > > > ClusterLabs home: > > > https://protect2.fireeye.com/url?k=4c5edd73-1
