Re: [ClusterLabs] Error in documentation of resource sets (collocation)?

2019-01-25 Thread Ken Gaillot
On Thu, 2019-01-17 at 14:50 +0100, Michael Schwartzkopff wrote:
> Hi,
> 
> 
> When I read the documentation of colocating sets of resources,
> the last "note" section reads:
> 
> 
> Pay close attention to the order in which resources and sets are
> listed.
> While the colocation dependency for members of any one set is
> last-to-first, the colocation dependency for multiple sets is
> first-to-last. In the above example, B is colocated with A, but
> colocated-set-1 is colocated with colocated-set-2.
> 
> But in the example above the config is:
> 
> [XML example stripped by the list archive: the documentation's
> colocation constraint whose resource set lists A and then B]
> 
> So reading from last-to-first inside the resource set would be "A"
> with "B", which also corresponds to the picture above. So the part
> "B is colocated with A" seems to be wrong and should be "A is
> colocated with B".

You're right. I think that section would benefit from some rewording as
well; the examples should be designed such that resource A is always
placed first, which will make them easier to compare. I'll try to get
that done for the next release candidate (FYI, only the 2.0
documentation is being updated at this point).
-- 
Ken Gaillot 

___
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Pacemaker 2.0.1-rc4 now available

2019-01-30 Thread Ken Gaillot
Source code for the fourth release candidate for Pacemaker version
2.0.1 is now available at:

https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.0.1-rc4

This candidate has a few more bug fixes. We should be getting close to
final release.

For those on the bleeding edge, the newest versions of GCC and glib
cause some issues. GCC 9 does stricter checking of print formats that
required a few log message fixes in this release (i.e. using GCC 9 with
the -Werror option will fail with any earlier release). A change in
glib 2.59.0's hash table implementation broke some of Pacemaker's
regression tests; for this release, these tests can be disabled with
the --disable-hash-affected-tests=try configure option (we'll make the
tests compatible as soon as practical, and that option will go away).

Everyone is encouraged to download, compile and test the new release.
We do many regression tests and simulations, but we can't cover all
possible use cases, so your feedback is important and appreciated.

I will also release 1.1.20-rc2 with selected backports from this
release soon.
-- 
Ken Gaillot 



___
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] need help in "Scratch Step-by-Step Instructions for Building Your First High-Availability " wget -O - http://localhost/server-status

2019-01-31 Thread Ken Gaillot
On Thu, 2019-01-31 at 16:09 +, Aliaj, Shpetim wrote:
> Hi Guys,
> 
> I am setting up a test cluster for Apache following step 6.4
> "Configure the Cluster", but I have https enabled with a cert, and
> the cert is configured for the IP of the host, not for localhost.
> 
> It says there that wget -O - http://localhost/server-status is the
> command to check whether the config is right.
> 
> That command does not work for me.
> 
> But the wget --no-check-certificate -O -  
> https://localhost/server-status
> --2019-01-31 11:04:08--  https://localhost/server-status
> Resolving localhost (localhost)... ::1, 127.0.0.1
> Connecting to localhost (localhost)|::1|:443... connected.
> WARNING: no certificate subject alternative name matches
>   requested host name ‘localhost’.
> HTTP request sent, awaiting response... 200 OK
> Length: 4225 (4.1K) [text/html]
> Saving to: ‘STDOUT’
> 
>  0% [                     ] 0   --.-K/s
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
> 
> Apache Status
> 
> Apache Server Status for localhost (via ::1)
> 
> 
> How can i add the --no-check-certificate or any other configs to the
> "pcs resource create WebSite ocf:heartbeat:apache
> configfile=/etc/httpd/conf/httpd.conf
> statusurl="https://localhost/server-status" op monitor interval=1min"

Use http instead of https here
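
For example, only the scheme in the status URL needs to change; this is
the command from the question, reflowed:

pcs resource create WebSite ocf:heartbeat:apache \
    configfile=/etc/httpd/conf/httpd.conf \
    statusurl="http://localhost/server-status" \
    op monitor interval=1min

If the resource already exists, something like
pcs resource update WebSite statusurl="http://localhost/server-status"
should have the same effect.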

> 
> thanks a lot
> 
> Tim
> 
-- 
Ken Gaillot 

___
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Resource not starting correctly

2019-04-15 Thread Ken Gaillot
 ;;
> 
> $OCF_NOT_RUNNING)
>   myapp_launch > /dev/null 2>&1
>   if [ $?  -eq 0 ]; then
> return $OCF_SUCCESS
>   fi
> 
>   return $OCF_ERR_GENERIC
>   ;;
> 
> *)
>   return $state
>   ;;
>   esac
> }
> 
> I know for a fact that, in one, myapp_launch gets invoked, and that
> its exit value is 0. The function therefore returns OCF_SUCCESS, as
> it should. However, if I understand things correctly, the log entries
> in two seem to claim that the exit value of the script in one is
> OCF_NOT_RUNNING. 

The start succeeded. It's the recurring monitor that failed.
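
To illustrate the distinction, the recurring monitor is a separate
action with its own exit code. A minimal sketch, where myapp_running is
a hypothetical helper and the OCF_* constants come from the
resource-agents shell functions:

myapp_monitor() {
    # myapp_running is assumed to exit 0 only while the app is up
    if myapp_running >/dev/null 2>&1; then
        return $OCF_SUCCESS       # 0: running
    fi
    return $OCF_NOT_RUNNING       # 7: cleanly stopped
    # return $OCF_ERR_GENERIC (1) if the state cannot be determined
}

If the monitor returns OCF_NOT_RUNNING or a generic error shortly after
a successful start, that is what shows up in the logs as the failure.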

> 
> What's going on here? It's obviously something to do with myapp-
> script - but, what? 
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Coming in 2.0.2: check whether a date-based rule is expired

2019-04-16 Thread Ken Gaillot
Hi all,

I wanted to point out an experimental feature that will be part of the
next release.

We are adding a "crm_rule" command that has the ability to check
whether a particular date-based rule is currently in effect.

The motivation is a perennial user complaint: expired constraints
remain in the configuration, which can be confusing.

We don't automatically remove such constraints, for several reasons: we
try to avoid modifying any user-specified configuration; expired
constraints are useful context when investigating an issue after it
happened; and crm_simulate can be run for any configuration for an
arbitrary past date to see what would have happened at that time.

The new command gives users (and high-level tools) a way to determine
whether a rule is in effect, so they can remove it themselves, whether
manually or in an automated way such as a cron.

You can use it like:

crm_rule -r <rule-id> [-d <date>] [-X <xml-file>]

With just -r, it will tell you whether the specified rule from the
configuration is currently in effect. If you give -d, it will check as
of that date and time (ISO 8601 format). If you give it -X, it will
look for the rule in the given XML rather than the CIB (you can also
use "-X -" to read the XML from standard input).

Example output:

% crm_rule -r my-current-rule
Rule my-current-rule is still in effect

% crm_rule -r some-long-ago-rule
Rule some-long-ago-rule is expired

% crm_rule -r some-future-rule
Rule some-future-rule has not yet taken effect

% crm_rule -r some-recurring-rule
Could not determine whether rule some-recurring-rule is expired

Scripts can use the exit status to distinguish the various cases.
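
The specific exit codes aren't listed here, so a cleanup script should
only rely on 0 meaning "still in effect"; a hedged sketch:

crm_rule -r some-long-ago-rule >/dev/null 2>&1
rc=$?
if [ "$rc" -eq 0 ]; then
    echo "rule is still in effect; keeping it"
else
    echo "crm_rule exited with $rc; the rule may be expired"
    # remove the enclosing constraint here once you have verified it
fi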

The command will be considered experimental for the 2.0.2 release; its
interface and behavior may change in future versions. The current
implementation has a limitation: the rule may contain only a single
date_expression, and the expression's operation must not be date_spec.

Other capabilities may eventually be added to crm_rule, for example the
ability to evaluate the current value of any cluster or resource
property.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Question about fencing

2019-04-17 Thread Ken Gaillot
On Wed, 2019-04-17 at 15:17 -0600, JCA wrote:
> Here is what I did:
> 
> # pcs stonith create disk_fencing fence_scsi pcmk_host_list="one two"
> pcmk_monitor_action="metadata" pcmk_reboot_action="off"
> devices="/dev/disk/by-id/ata-VBOX_HARDDISK_VBaaa429e4-514e8ecb" meta
> provides="unfencing"
> 
> where ata-VBOX-... corresponds to the device where I have the
> partition that is shared between both nodes in my cluster. The
> command completes without any errors (that I can see) and after that
> I have
> 
> # pcs status
> Cluster name: ClusterOne
> Stack: corosync
> Current DC: one (version 1.1.19-8.el7_6.4-c3c624ea3d) - partition
> with quorum
> Last updated: Wed Apr 17 14:35:25 2019
> Last change: Wed Apr 17 14:11:14 2019 by root via cibadmin on one
> 
> 2 nodes configured
> 5 resources configured
> 
> Online: [ one two ]
> 
> Full list of resources:
> 
>  MyCluster(ocf::myapp:myapp-script):  Stopped
>  Master/Slave Set: DrbdDataClone [DrbdData]
>  Stopped: [ one two ]
>  DrbdFS   (ocf::heartbeat:Filesystem):Stopped
>  disk_fencing (stonith:fence_scsi):   Stopped

Your pcs command looks good to me. I'm perplexed why everything is
stopped.

Check the logs on the DC (node one in the output above) for error or
warning messages around this time. /var/log/messages is usually
sufficient, but the detail log will have (obviously) more details
(usually /var/log/pacemaker.log or /var/log/cluster/corosync.log).
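
A hedged starting point (log locations vary by distribution, as noted
above):

grep -iE 'error|warning' /var/log/messages
grep -iE 'error|warning|disk_fencing' /var/log/cluster/corosync.log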

> Daemon Status:
>   corosync: active/enabled
>   pacemaker: active/enabled
>   pcsd: active/enabled
> 
> Things stay that way indefinitely, until I set stonith-enabled to
> false - at which point all the resources above get started
> immediately.
> 
> Obviously, I am missing something big here. But, what is it?
> 
> 
> On Wed, Apr 17, 2019 at 2:59 PM Adam Budziński <
> budzinski.a...@gmail.com> wrote:
> > You did not configure any fencing device.
> > 
> > śr., 17.04.2019, 22:51 użytkownik JCA <1.41...@gmail.com> napisał:
> > > I am trying to get fencing working, as described in the "Cluster
> > > from Scratch" guide, and I am stymied at get-go :-(
> > > 
> > > The document mentions a property named stonith-enabled. When I
> > > was trying to get my first cluster going, I noticed that my
> > > resources would start only when this property is set to false, by
> > > means of 
> > > 
> > > # pcs property set stonith-enabled=false
> > > 
> > > Otherwise, all the resources remain stopped.
> > > 
> > > I created a fencing resource for the partition that I am sharing
> > > across the the nodes, by means of DRBD. This works fine - but I
> > > still have the same problem as above - i.e. when stonith-enabled
> > > is set to true, all the resources get stopped, and remain in that
> > > state.
> > > 
> > > I am very confused here. Can anybody point me in the right
> > > direction out of this conundrum?
> > > 
> > > 
> > > 
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] shutdown of 2-Node cluster when power outage

2019-04-18 Thread Ken Gaillot
On Thu, 2019-04-18 at 16:11 +0200, Lentes, Bernd wrote:
> Hi,
> 
> i have a two-node cluster, both servers are buffered by an UPS.
> If power is gone the UPS sends after a configurable time a signal via
> network to shutdown the servers.
> The UPS-Software (APC Power Chute Network Shutdown) gives me on the
> host the possibility to run scripts
> before it shuts down.
> 
> What would be the right procedure to shutdown the complete cluster
> cleanly ?
> 
> Many Thanks.
> 
> 
> Bernd

Simply stopping pacemaker and corosync by whatever mechanism your
distribution uses (e.g. systemctl) should be sufficient.
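
A rough sketch of what the PowerChute hook could run, assuming pcs is
installed (otherwise stop the services on each node):

#!/bin/sh
# stop the whole cluster in one step so neither node treats the
# other's shutdown as a failure
pcs cluster stop --all
# per-node alternative:
#   systemctl stop pacemaker corosync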
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Fwd: Postgres pacemaker cluster failure

2019-05-15 Thread Ken Gaillot
On Wed, 2019-05-15 at 11:50 +0200, Jehan-Guillaume de Rorthais wrote:
> On Mon, 29 Apr 2019 19:59:49 +0300
> Andrei Borzenkov  wrote:
> 
> > 29.04.2019 18:05, Ken Gaillot wrote:
> > > >  
> > > > > Why does not it check OCF_RESKEY_CRM_meta_notify?  
> > > > 
> > > > I was just not aware of this env variable. Sadly, it is not
> > > > documented
> > > > anywhere :(  
> > > 
> > > It's not a Pacemaker-created value like the other notify
> > > variables --
> > > all user-specified meta-attributes are passed that way. We do
> > > need to
> > > document that.  
> > 
> > OCF_RESKEY_CRM_meta_notify is passed also when "notify" meta-
> > attribute
> > is *not* specified, as well as a couple of others. But not all 

Hopefully in that case it's passed as false? I vaguely remember some
case where clone attributes were mistakenly passed to non-clone
resources, but I think notify is always accurate for clone resources.

> > possible
> > attributes. And some OCF_RESKEY_CRM_meta_* variables that are
> > passed do
> > not correspond to any user settable and documented meta-attribute,
> > like
> > OCF_RESKEY_CRM_meta_clone.
> 
> Sorry guys, now I am confused.

A well-known side effect of pacemaker ;)

> Is it safe or not to use OCF_RESKEY_CRM_meta_notify? You both don't
> seem to agree on where it comes from. Is it only an unexpected side
> effect, or is it a safe and stable code path in Pacemaker that we can
> rely on?

It's reliable. All user-specified meta-attributes end up as environment
variables -- it's just meta-attributes that *aren't* specified by the
user that may or may not show up (but hopefully with the correct
value).
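
For a shell-based agent, consuming it could look roughly like the
sketch below (pgsqlms itself is Perl, so this is only an illustration;
ocf_is_true, ocf_exit_reason and the OCF_* constants come from the
resource-agents shell library):

. "${OCF_FUNCTIONS_DIR:-${OCF_ROOT}/lib/heartbeat}/ocf-shellfuncs"

# treat the meta-attribute as false unless it is explicitly true
if ! ocf_is_true "${OCF_RESKEY_CRM_meta_notify:-false}"; then
    ocf_exit_reason "this agent must be cloned with notify=true"
    exit $OCF_ERR_CONFIGURED
fi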

> 
> Is it worth a patch in the pgsqlms RA?
> 
> Thanks,
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] Regarding Finalization Timer (I_ELECTION) just popped (1800000ms)

2019-05-15 Thread Ken Gaillot
22 22:35:14 [76417] vm85c4465533   crmd: info:
> crm_update_peer_join:initialize_join: Node
> vm46890219c5[218109612] - join-3 phase 1 -> 0
> Oct 22 22:35:14 [76417] vm85c4465533   crmd: info:
> crm_update_peer_join:initialize_join: Node
> vm85c4465533[83891884] - join-3 phase 1 -> 0
> Oct 22 22:35:14 [76417] vm85c4465533   crmd: info:
> crm_update_peer_join:initialize_join: Node
> vm2ad44008dc[251664044] - join-3 phase 1 -> 0
> Oct 22 22:35:14 [76417] vm85c4465533   crmd: info:
> join_make_offer: join-3: Sending offer to vm46890219c5
> Oct 22 22:35:14 [76417] vm85c4465533   crmd: info:
> crm_update_peer_join:join_make_offer: Node
> vm46890219c5[218109612] - join-3 phase 0 -> 1
> Oct 22 22:35:14 [76417] vm85c4465533   crmd: info:
> join_make_offer: join-3: Sending offer to vm85c4465533
> Oct 22 22:35:14 [76417] vm85c4465533   crmd: info:
> crm_update_peer_join:join_make_offer: Node
> vm85c4465533[83891884] - join-3 phase 0 -> 1
> Oct 22 22:35:14 [76417] vm85c4465533   crmd: info:
> join_make_offer: join-3: Sending offer to vm2ad44008dc
> Oct 22 22:35:14 [76417] vm85c4465533   crmd: info:
> crm_update_peer_join:join_make_offer: Node
> vm2ad44008dc[251664044] - join-3 phase 0 -> 1
> Oct 22 22:35:14 [76417] vm85c4465533   crmd: info:
> do_dc_join_offer_all:join-3: Waiting on 3 outstanding join
> acks
> Oct 22 22:35:14 [76412] vm85c4465533cib: info:
> cib_process_request: Completed cib_modify operation for section
> crm_config: OK (rc=0, origin=local/crmd/93, version=0.102.0)
> Oct 22 22:35:14 [76417] vm85c4465533   crmd: info:
> plugin_handle_membership:Membership 52: quorum retained
> Oct 22 22:35:14 [76417] vm85c4465533   crmd: info:
> crmd_cs_dispatch:Setting expected votes to 3
> Oct 22 22:35:14 [76417] vm85c4465533   crmd: info:
> plugin_handle_membership:Membership 52: quorum retained
> Oct 22 22:35:14 [76412] vm85c4465533cib: info:
> cib_process_request: Completed cib_modify operation for section
> crm_config: OK (rc=0, origin=local/crmd/96, version=0.102.0)
> Oct 22 22:35:14 [76417] vm85c4465533   crmd: info:
> crmd_cs_dispatch:Setting expected votes to 3
> Oct 22 22:35:14 [76417] vm85c4465533   crmd: info:
> update_dc:   Set DC to vm85c4465533 (3.0.10)
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Pacemaker not reacting as I would expect when two resources fail at the same time

2019-05-31 Thread Ken Gaillot
On Thu, 2019-05-30 at 23:39 +, Harvey Shepherd wrote:
> Hi All,
> 
> I'm running Pacemaker 2.0.1 on a cluster containing two nodes; one
> master and one slave. I have a main master/slave resource
> (m_main_system), a group of resources that run in active-active mode
> (active_active - i.e. run on both nodes), and a group that runs in
> active-disabled mode (snmp_active_disabled - resources only run on
> the current promoted master). The snmp_active_disabled group is
> configured to be co-located with the master of m_main_system, so only
> a failure of the master m_main_system resource can trigger a
> failover. The constraints specify that m_main_system must be started
> before snmp_active_disabled.
> 
> The problem I'm having is that when a resource in the
> snmp_active_disabled group fails and gets into a constant cycle where
> Pacemaker tries to restart it, and I then kill m_main_system on the
> master, then Pacemaker still constantly tries to restart the failed
> snmp_active_disabled resource and ignores the more important
> m_main_system process which should be triggering a failover. If I
> stabilise the snmp_active_disabled resource then Pacemaker finally
> acts on the m_main_system failure. I hope I've described this well
> enough, but I've included a cut down form of my CIB config below if
> it helps!
> 
> Is this a bug or an error in my config? Perhaps the order in which
> the groups are defined in the CIB matters despite the constraints?
> Any help would be gratefully received.
> 
> Thanks,
> Harvey
> 
> 
>   [The XML of the CIB snippet was mangled by the list archive. It
>   defined the snmp_active_disabled and active_active groups (the
>   latter cloned as clone_active_active), the m_main_system
>   master/slave resource of agent type main-system-ocf with start,
>   stop, promote, demote, monitor (Master and Slave roles) and notify
>   operations, and the constraints below.]
> 
>   <rsc_colocation score="INFINITY" rsc="snmp_active_disabled"
>       with-rsc="m_main_system" with-rsc-role="Master"/>
>   <rsc_order kind="Mandatory" first="m_main_system"
>       then="snmp_active_disabled"/>

You want first-action="promote" in the above constraint, otherwise the
slave being started (or the master being started but not yet promoted)
is sufficient to start snmp_active_disabled (even though the colocation
ensures it will only be started on the same node where the master will
be).

I'm not sure if that's related to your issue, but it's worth trying
first.

>   <rsc_order first="m_main_system" then="clone_active_active"/>

You may also want to set interleave to true on clone_active_active, if
you want it to depend only on the local instance of m_main_system, and
not both instances.
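
If the cluster is managed with pcs (an assumption; the same edits can
be made directly in the CIB XML), the two suggestions above could look
roughly like:

# replace the existing m_main_system -> snmp_active_disabled order
# constraint with one tied to the promote action
pcs constraint order promote m_main_system then start snmp_active_disabled

# let each active_active instance depend only on its local copy of
# m_main_system
pcs resource meta clone_active_active interleave=true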

>   
>   
> 
>   
>   
>   
> 
>   
> 
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Fence agent definition under Centos7.6

2019-05-31 Thread Ken Gaillot
On Fri, 2019-05-31 at 22:32 +, Michael Powell wrote:
> Although I am personally a novice wrt cluster operation, several
> years ago my company developed a product that used Pacemaker.  I’ve
> been charged with porting that product to a platform running Centos
> 7.6.  The old product ran Pacemaker 1.1.13 and heartbeat.  For the
> most part, the transition to Pacemaker 1.1.19 and Corosync has gone
> pretty well, but there’s one aspect that I’m struggling with: fence-
> agents.
>  
> The old product used a fence agent developed in house to implement
> STONITH.  While it was no trouble to compile and install the code,
> named mgpstonith, I see lots of messages like the following in the
> system log –
>  
> stonith-ng[31120]:error: Unknown fence agent:
> external/mgpstonith  

Support for the "external" fence agents (a.k.a. Linux-HA-style agents)
is a compile-time option in pacemaker because it requires the third-
party cluster-glue library. CentOS doesn't build with that support.

Your options are either build cluster-glue and pacemaker yourself
instead of using the CentOS pacemaker packages, or rewrite the agent to
be an RHCS-style agent:

https://github.com/ClusterLabs/fence-agents/blob/master/doc/FenceAgentAPI.md
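
A very rough skeleton of that style of agent (the name fence_mgp and
its single "port" parameter are made up here). Pacemaker's fencer runs
the agent and feeds the options, including the action, as name=value
lines on stdin; the agent is normally installed as an executable in
/usr/sbin:

#!/bin/sh
# hypothetical RHCS-style fence agent skeleton
action=""
port=""
# read the options Pacemaker passes on stdin
while read -r line; do
    case "$line" in
        action=*) action=${line#action=} ;;
        port=*)   port=${line#port=} ;;
    esac
done

case "$action" in
metadata)
    cat <<'EOF'
<?xml version="1.0" ?>
<resource-agent name="fence_mgp" shortdesc="Skeleton fence agent">
  <parameters>
    <parameter name="port">
      <content type="string"/>
      <shortdesc lang="en">Host to power-control</shortdesc>
    </parameter>
  </parameters>
  <actions>
    <action name="on"/>
    <action name="off"/>
    <action name="reboot"/>
    <action name="monitor"/>
    <action name="metadata"/>
  </actions>
</resource-agent>
EOF
    ;;
on|off|reboot)
    # real power-control logic for "$port" goes here; exit non-zero on failure
    ;;
monitor|status)
    # check that the fencing device itself is reachable
    ;;
*)
    exit 1
    ;;
esac
exit 0

For a quick manual test, echo action=metadata | ./fence_mgp should
print the metadata; for production use, the fencing library that ships
with the fence-agents project is usually an easier starting point.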
  
> stonith-ng[31120]:error: Agent external/mgpstonith not found or
> does not support meta-data: Invalid argument (22)
> stonith-ng[31120]:error: Could not retrieve metadata for fencing
> agent external/mgpstonith   
>  
> I’ve put debug messages in mgpstonith, and as they do not appear in
> the system log, I’ve inferred that it is in fact never executed.
>  
> Initially, I installed mgpstonith on /lib64/stonith/plugins/external,
> which is where it was located on the old product.  I’ve copied it to
> other locations, e.g. /usr/sbin, with no better luck.  I’ve searched
> the web and while I’ve found lots of information about using the
> available fence agents, I’ve not turned up any information on how to
> create one “from scratch”.
>  
> Specifically, I need to know where to put mgpstonith on the target
> system(s).  Generally, I’d appreciate a pointer to any
> documentation/specification relevant to writing code for a fence
> agent.
>  
> Thanks,
>   Michael
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] Resource-agents log is not output to /var/log/pacemaker/pacemaker.log on RHEL8

2019-05-28 Thread Ken Gaillot
On Mon, 2019-05-27 at 14:12 +0900, 飯田雄介 wrote:
> Hi,
> 
> I am verifying the operation of the cluster with RHEL8.
> In the verification, I noticed that resource-agents log was not
> output to /var/log/pacemaker/pacemaker.log.
> "/etc/sysconfig/pacemaker" is used by default.
> 
> I know that resource-agents logs are output when passing the
> HA_logfile environment variable.

That's an interesting side effect ... with pacemaker 2.0 (which is in
RHEL 8), one of the changes was that pacemaker now always uses its own
log rather than using any corosync log that is configured.

Before that change, when pacemaker detected corosync's log, it would
set HA_logfile in the environment. Since it's not looking for that log
anymore, all that code is gone.

> I have confirmed that Pacemaker on RHEL7 sets this environment
> variable:
> ```
> # cat /proc/$(pidof /usr/libexec/pacemaker/lrmd)/environ | tr '\0'
> '\n' | sort
> HA_LOGD=no
> HA_LOGFACILITY=daemon
> HA_cluster_type=corosync
> HA_debug=0
> HA_logfacility=daemon
> HA_logfile=/var/log/cluster/corosync.log
> HA_mcp=true
> HA_quorum_type=corosync
> HA_use_logd=off
> LANG=ja_JP.UTF-8
> LC_ALL=C
> NOTIFY_SOCKET=/run/systemd/notify
> PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
> PCMK_cluster_type=corosync
> PCMK_debug=0
> PCMK_logfacility=daemon
> PCMK_logfile=/var/log/cluster/corosync.log
> PCMK_mcp=true
> PCMK_quorum_type=corosync
> PCMK_service=pacemakerd
> PCMK_use_logd=off
> PCMK_watchdog=false
> VALGRIND_OPTS=--leak-check=full --tresource-agentsce-children=no --
> vgdb=no --num-callers=25 --log-file=/var/lib/pacemaker/valgrind-%p --
> suppressions=/usr/share/pacemaker/tests/valgrind-pcmk.suppressions --
> gen-suppressions=all
> ```
> 
> However, it seems that this environment variable is not set in RHEL8.
> ```
> # cat /proc/$(pidof /usr/libexec/pacemaker/pacemaker-execd)/environ |
> tr '\0' '\n' | sort
> HA_LOGFACILITY=daemon
> HA_cluster_type=corosync
> HA_debug=0
> HA_logfacility=daemon
> HA_mcp=true
> HA_quorum_type=corosync
> INVOCATION_ID=6204f0841b814f6c92ea20db02b8ec9e
> JOURNAL_STREAM=9:1314759
> LANG=ja_JP.UTF-8
> LC_ALL=C
> NOTIFY_SOCKET=/run/systemd/notify
> PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
> PCMK_cluster_type=corosync
> PCMK_debug=0
> PCMK_logfacility=daemon
> PCMK_mcp=true
> PCMK_quorum_type=corosync
> PCMK_service=pacemakerd
> PCMK_watchdog=false
> SBD_DELAY_START=no
> SBD_OPTS=
> SBD_PACEMAKER=yes
> SBD_STARTMODE=always
> SBD_TIMEOUT_ACTION=flush,reboot
> SBD_WATCHDOG_DEV=/dev/watchdog
> SBD_WATCHDOG_TIMEOUT=5
> VALGRIND_OPTS=--leak-check=full --tresource-agentsce-children=no --
> vgdb=no --num-callers=25 --log-file=/var/lib/pacemaker/valgrind-%p --
> suppressions=/usr/share/pacemaker/tests/valgrind-pcmk.suppressions --
> gen-suppressions=all
> ```
> 
> Is this the intended behavior?
> 
> By the way, when /var/log/pacemaker/pacemaker.log is explicitly set
> in the PCMK_logfile, it is confirmed that the resource-agents log is
> output to the file set in the PCMK_logfile.

Interesting ... the resource-agents library must look for PCMK_logfile
as well as HA_logfile. In that case, the easiest solution will be for
us to set PCMK_logfile explicitly in the shipped sysconfig file. I can
squeeze that into the soon-to-be-released 2.0.2 since it's not a code
change.
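
Until then, the workaround is a single line in the shipped sysconfig
file, e.g.:

# /etc/sysconfig/pacemaker (path varies by distribution)
PCMK_logfile=/var/log/pacemaker/pacemaker.log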

> 
> Regards,
> Yusuke
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] Maintenance & Pacemaker Restart Demotes MS Resources

2019-06-05 Thread Ken Gaillot
On Wed, 2019-06-05 at 07:40 -0700, Dirk Gassen wrote:
> Hi,
> 
> I have the following CIB:
> > primitive AppserverIP IPaddr \
> > params ip=10.1.8.70 cidr_netmask=255.255.255.192 nic=eth0 \
> > op monitor interval=30s
> > primitive MariaDB mysql \
> > params binary="/usr/bin/mysqld_safe"
> pid="/var/run/mysqld/mysqld.pid" socket="/var/run/mysqld/mysqld.sock"
> replication_user=repl replication_passwd="r3plic@tion"
> max_slave_lag=15 evict_outdated_slaves=false test_user=repl
> test_passwd="r3plic@tion" config="/etc/mysql/my.cnf" user=mysql
> group=mysql datadir="/opt/mysql" \
> > op monitor interval=27s role=Master OCF_CHECK_LEVEL=1 \
> > op monitor interval=35s timeout=30 role=Slave
> OCF_CHECK_LEVEL=1 \
> > op start interval=0 timeout=130 \
> > op stop interval=0 timeout=130
> > ms ms_MariaDB MariaDB \
> > meta master-max=1 master-node-max=1 clone-node-max=1
> notify=true globally-unique=false target-role=Started is-managed=true
> > colocation colo_sm_aip inf: AppserverIP:Started ms_MariaDB:Master
> 
> When I do "crm node testras3 maintenance && systemctl stop pacemaker
> && systemctl start pacemaker && crm node testras3 ready" the cluster
> decides to demote ms_MariaDB and (because of the colocation) to stop
> AppserverIP. it then follows up immediately with promoting ms_MariaDB
> and starting AppserverIP again.
> 
> If I leave out restarting pacemaker the cluster does not demote
> ms_MariaDB and AppserverIP is left running.
> 
> Why is the demotion happening and is there a way to avoid this?

It looks like there isn't enough time between starting pacemaker and
taking the node out of maintenance for pacemaker to re-detect the state
of all resources. It's best to do that manually, i.e. wait for the
status output to show all the resources again, but you could automate
it with a fixed sleep or maybe a brief sleep plus crm_resource --wait.
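
A sketch of that automation, using the commands from your message (the
sleep length is arbitrary; crm_resource --wait simply blocks until the
cluster goes idle again):

crm node maintenance testras3
systemctl stop pacemaker
systemctl start pacemaker
sleep 10
crm_resource --wait
crm node ready testras3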

> Corosync 2.3.5-3ubuntu2.3 and Pacemaker 1.1.14-2ubuntu1.6
> 
> Sincerely,
> Dirk
> -- 
> Dirk Gassen
> Senior Software Engineer | GetWellNetwork
> o: 240.482.3146
> e: dgas...@getwellnetwork.com
> To help people take an active role in their health journey
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] VirtualDomain and SELinux

2019-06-07 Thread Ken Gaillot
On Thu, 2019-06-06 at 11:58 +0100, lejeczek wrote:
> hi guys,
> 
> I'm trying to find a good location for xml config file but I think I
> have not enough booleans (not only for xml config) and SELinux is
> stopping pacemaker to start/manage virts domains.
> 
> Anybody managed to manage SELinux in such a way where it all works?
> 
> I'm on Centos 7.6 with selinux-policy-3.13.1-229.el7_6.12.noarch
> 
> many thanks, L.

Pacemaker processes run in the cluster_t context and so are subject to
those policies. I'm not sure what all is available under that, but
maybe something under /etc?
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Possible intrusive change in bundles for 2.0.3

2019-06-07 Thread Ken Gaillot
On Fri, 2019-06-07 at 16:19 +, Hayden,Robert wrote:
> Thanks
> Robert
> 
> Robert Hayden | Sr. Technology Architect | Cerner Corporation |
> 
> > -Original Message-
> > From: Users  On Behalf Of Ken
> > Gaillot
> > Sent: Thursday, June 6, 2019 5:35 PM
> > To: Cluster Labs - All topics related to open-source clustering
> > welcomed
> > 
> > Subject: [ClusterLabs] Possible intrusive change in bundles for
> > 2.0.3
> > 
> > Hi all,
> > 
> > It has been discovered that newer versions of selinux-policy
> > prevent bundles
> > in pacemaker 2.0 from logging. I have a straightforward fix, but it
> > means that
> > whenever a cluster is upgraded from pre-2.0.3 to
> > 2.0.3 or later, all active bundles will restart once the last older
> > node leaves
> > the cluster.
> 
> Is this cluster restart only when crossing the 2.0.3 release?  Or for
> each minor after the 2.0.3?

It would only be for crossing 2.0.3. Only bundle resources are
affected, not all resources in the cluster.

> Rolling upgrades are ideal and much easier to justify getting
> maintenance windows
> scheduled.
> 
> > 
> > This is because the fix passes the "Z" mount flag to docker or
> > podman, which
> > tells them to create a custom SELinux policy for the bundle's
> > container and
> > log directory. This is the easiest and most restrictive solution.
> > 
> > An alternative approach would be for pacemaker to start delivering
> > its own
> > custom SELinux policy as a separate package. The policy would allow
> > all
> > pacemaker-launched containers to access all of
> > /var/log/pacemaker/bundles, which is a bit broader access (not to
> > mention
> > more of a pain to maintain over the longer term). This would avoid
> > the
> > restart.
> > 
> > I'm leaning to the in-code solution, but I want to ask if anyone
> > thinks the
> > bundle restarts on upgrade are a deal-breaker for a minor-minor
> > release, and
> > would prefer the packaged policy solution.
> 
> I am not 100% sure of the configuration you are referring to with
> bundles.

It's a relatively new type of pacemaker resource for running containers
along with the IP addresses/ports and exported directories they need.
No other resources would be affected.

> Overall, I would prefer the SELinux policy to be a separate package,
> or incorporated into the
> main SELinux policies as a Boolean.  Seems to me to be a better long
> term solution,
> albeit painful.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Maintenance & Pacemaker Restart Demotes MS Resources

2019-06-07 Thread Ken Gaillot
On Fri, 2019-06-07 at 07:31 -0700, Dirk Gassen wrote:
> Thanks, that seems to have been the problem in my case. (For some
> reason the attribute did not reappear on its own, but adding it
> manually w/ crm_attribute did work).
> 
> I assume that this happened since I didn't have another node that
> could become the DC while restarting pacemaker? If I do add another
> node then the problem doesn't seem to appear.

Yes, that makes sense.

> 
> Dirk
> 
> On Wed, Jun 5, 2019 at 3:17 PM Ken Gaillot 
> wrote:
> > On Wed, 2019-06-05 at 13:28 -0700, Dirk Gassen wrote:
> > > Thanks for your quick reply. I should have been a bit more
> > verbose in
> > > my problem description.
> > > 
> > > After starting up pacemaker again and before "crm node testras3
> > > ready" I did actually monitor the cluster with "crm_mon" and
> > waited
> > > until it indicated that it knew about the states of the
> > resources.
> > > 
> > > Here is actually the excerpt from syslog:
> > > * crm node maintenance testras3
> > > > 16:14:50 On loss of CCM Quorum: Ignore
> > > > 16:14:50 Forcing unmanaged master MariaDB:0 to remain promoted
> > on
> > > testras3
> > > > 16:14:50 Calculated Transition 12:
> > /var/lib/pacemaker/pengine/pe-
> > > input-72.bz2
> > > * systemctl stop pacemaker
> > > > 16:15:29 On loss of CCM Quorum: Ignore
> > > > 16:15:29 Forcing unmanaged master MariaDB:0 to remain promoted
> > on
> > > testras3
> > 
> > Ah, there is no master score for MariaDB, so when the node leaves
> > maintenance mode, the resource must be demoted.
> > 
> > Restarting pacemaker clears all transient node attributes
> > (including
> > the master score). The next monitor would set it again, but
> > maintenance
> > mode cancels monitors, so it won't run until it comes out of
> > maintenance mode, at which point it wants to do the demote.
> > 
> > A good way around this would be to unmanage the MariaDB resource
> > before
> > putting the node in maintenance. When you take the node out of
> > maintenance, the monitor will start up again, but it won't take any
> > actions. Once the monitor runs and sets the master score (which you
> > can
> > confirm with crm_master --query --resource MariaDB --node <node>), you
> > can manage the resource.
> > 
> > > > 16:15:29 Scheduling Node testras3 for shutdown
> > > > 16:15:29 Calculated Transition 13:
> > /var/lib/pacemaker/pengine/pe-
> > > input-73.bz2
> > > > 16:15:29 Invoking handler for signal 15: Terminated
> > > * systemctl start pacemaker
> > > > 16:15:57 Additional logging available in /var/log/pacemaker.log
> > > > 16:16:20 On loss of CCM Quorum: Ignore
> > > > 16:16:20 Calculated Transition 0:
> > /var/lib/pacemaker/pengine/pe-
> > > input-74.bz2
> > > > 16:16:20 On loss of CCM Quorum: Ignore
> > > > 16:16:20 Forcing unmanaged master MariaDB:0 to remain promoted
> > on
> > > testras3
> > > > 16:16:20 Calculated Transition 1:
> > /var/lib/pacemaker/pengine/pe-
> > > input-75.bz2
> > > * crm node ready testras3
> > > > 16:18:01 On loss of CCM Quorum: Ignore
> > > > 16:18:01 StopAppserverIP#011(testras3)
> > > > 16:18:01 Demote  MariaDB:0#011(Master -> Slave testras3)
> > > > 16:18:01 Calculated Transition 2:
> > /var/lib/pacemaker/pengine/pe-
> > > input-76.bz2
> > > > 16:18:01 On loss of CCM Quorum: Ignore
> > > > 16:18:01 Start   AppserverIP#011(testras3)
> > > > 16:18:01 Promote MariaDB:0#011(Slave -> Master testras3)
> > > > 16:18:01 Calculated Transition 3:
> > /var/lib/pacemaker/pengine/pe-
> > > input-77.bz2
> > > > 16:18:02 On loss of CCM Quorum: Ignore
> > > > 16:18:02 Calculated Transition 4:
> > /var/lib/pacemaker/pengine/pe-
> > > input-78.bz2
> > > 
> > > So it looks like to me that the cluster is demoting ms_MariaDB
> > from
> > > Master to Slave. I'm not sure if I should have waited for
> > something
> > > else to occur?
> > > 
> > > I have attached pe-input-76.bz2.
> > > 
> > > Dirk
> > > 
> > > On Wed, Jun 5, 2019 at 10:22 AM Ken Gaillot 
> > > wrote:
> > > > On Wed, 2019-06-05 at 07:40 -0700, Dirk Gassen wrote:
> > > > > Hi,
> > > > > 
> > > > > I have the following C

Re: [ClusterLabs] cluster move resources back despite of stickiness ?

2019-06-07 Thread Ken Gaillot
On Fri, 2019-06-07 at 10:27 +0100, lejeczek wrote:
> hi guys,
> 
> I have a small cluster with:
> 
> $ pcs property set default-resource-stickiness=100

The above is deprecated, the below is current

> $ pcs resource defaults resource-stickiness=100
> 
> $ pcs resource meta PRIV-204 stickiness=100 (on which other resources
> colocate-depend)
> 
> When I stop node A (or its cluster services), on which resources are
> running, these are moved to node B, all good. But these resources do
> not stick to node B when node A comes back online; the resources move
> back to node A again.
> 
> What is it that I'm missing? Why don't the resources stick as they
> should?

There must be some other score outweighing the stickiness. Show the
cluster constraints to see if there is anything you don't expect. In
particular, using ban/move commands creates permanent constraints that
have to be cleared when you no longer want them.
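
A hedged way to check, assuming pcs (the resource name is taken from
your message):

# show all constraints with their ids and scores; look for location
# constraints created by "pcs resource move" or "pcs resource ban"
pcs constraint --full

# clear any leftover move/ban constraint for the resource, e.g.:
pcs resource clear PRIV-204
# (older pcs versions: crm_resource --clear --resource PRIV-204)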

> 
> How do I make the cluster understand that the cost of behaving as
> above is too high?
> 
> many thanks, L.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Possible intrusive change in bundles for 2.0.3

2019-06-07 Thread Ken Gaillot
On Fri, 2019-06-07 at 10:15 +0100, lejeczek wrote:
> On 06/06/2019 23:34, Ken Gaillot wrote:
> > Hi all,
> > 
> > It has been discovered that newer versions of selinux-policy
> > prevent
> > bundles in pacemaker 2.0 from logging. I have a straightforward
> > fix,
> > but it means that whenever a cluster is upgraded from pre-2.0.3 to
> > 2.0.3 or later, all active bundles will restart once the last older
> > node leaves the cluster.
> > 
> > This is because the fix passes the "Z" mount flag to docker or
> > podman,
> > which tells them to create a custom SELinux policy for the bundle's
> > container and log directory. This is the easiest and most
> > restrictive
> > solution.
> > 
> > An alternative approach would be for pacemaker to start delivering
> > its
> > own custom SELinux policy as a separate package. The policy would
> > allow
> > all pacemaker-launched containers to access all of
> > /var/log/pacemaker/bundles, which is a bit broader access (not to
> > mention more of a pain to maintain over the longer term). This
> > would
> > avoid the restart.
> > 
> > I'm leaning to the in-code solution, but I want to ask if anyone
> > thinks
> > the bundle restarts on upgrade are a deal-breaker for a minor-minor
> > release, and would prefer the packaged policy solution.
> 
> I personally could live with such a case of restart on my small
> deployment.
> 
> (what does the "bundle" constitute?)

Bundles are a special type of pacemaker resource for running
containers. If you don't use them, you'll be unaffected. :)

> But there is more in terms of SELinux which should be investigated
> and
> fixed when it comes to pacemaker. Yesterday I had to prep a custom
> selinux module because SE policies stop pacemaker from
> starting/managing
> virt domain with storage off a gluster volume and xml config in
> /var/lib/pacemaker.
> 
> thanks, L.

Yes, unfortunately SELinux is very service-specific, so each resource
must be evaluated individually when SELinux is enabled. Also pacemaker
runs in the cluster_t context, so it's subject to those policies as far
as file creation etc.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Possible intrusive change in bundles for 2.0.3

2019-06-06 Thread Ken Gaillot
Hi all,

It has been discovered that newer versions of selinux-policy prevent
bundles in pacemaker 2.0 from logging. I have a straightforward fix,
but it means that whenever a cluster is upgraded from pre-2.0.3 to
2.0.3 or later, all active bundles will restart once the last older
node leaves the cluster.

This is because the fix passes the "Z" mount flag to docker or podman,
which tells them to create a custom SELinux policy for the bundle's
container and log directory. This is the easiest and most restrictive
solution.
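
For anyone curious, the flag itself can be seen outside of Pacemaker
with something like the following (the directory and image are
illustrative only):

mkdir -p /var/log/pacemaker/bundles/example
# the :Z suffix tells podman/docker to relabel the host directory so
# the container's private SELinux context is allowed to write to it
podman run --rm -v /var/log/pacemaker/bundles/example:/var/log:Z \
    docker.io/library/alpine sh -c 'echo ok > /var/log/test.log'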

An alternative approach would be for pacemaker to start delivering its
own custom SELinux policy as a separate package. The policy would allow
all pacemaker-launched containers to access all of
/var/log/pacemaker/bundles, which is a bit broader access (not to
mention more of a pain to maintain over the longer term). This would
avoid the restart.

I'm leaning to the in-code solution, but I want to ask if anyone thinks
the bundle restarts on upgrade are a deal-breaker for a minor-minor
release, and would prefer the packaged policy solution.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] VirtualDomain and Resource_is_Too_Active ?? - problem/error

2019-05-29 Thread Ken Gaillot
: Invalid recurring action chenbro0.1-raid5-mnt-stop-
> interval-90
> wth name: 'stop'
>error: Resource HA-work9-win10-kvm is active on 3 nodes
> (attempting
> recovery)
>   notice: See
> https://wiki.clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more
> information
>error: Calculated transition 1867 (with errors), saving inputs in
> /var/lib/pacemaker/pengine/pe-error-58.bz2
>   notice: Transition 1867 (Complete=0, Pending=0, Fired=0, Skipped=0,
> Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-error-58.bz2):
> Complete
>   notice: Configuration ERRORs found during PE processing.  Please
> run
> "crm_verify -L" to identify issues.
> ...
> 
> 
> $ pcs status --all
> 
> ...
> 
> Failed Actions:
> * HA-work9-win10-kvm_stop_0 on whale.private 'unknown error' (1):
> call=205, status=complete, exitreason='forced stop failed',
> last-rc-change='Wed May 29 11:32:23 2019', queued=0ms,
> exec=3158ms
> * HA-work9-win10-kvm_stop_0 on swir.private 'unknown error' (1):
> call=125, status=complete, exitreason='forced stop failed',
> last-rc-change='Wed May 29 11:32:23 2019', queued=0ms,
> exec=3398ms
> * HA-work9-win10-kvm_stop_0 on rider.private 'unknown error' (1):
> call=129, status=complete, exitreason='forced stop failed',
> last-rc-change='Wed May 29 11:32:23 2019', queued=0ms,
> exec=2934ms
> 
> $ crm_verify -L -V
>error: unpack_rsc_op:Preventing HA-work9-win10-kvm from
> re-starting anywhere: operation stop failed 'not configured' (6)
>error: unpack_rsc_op:Preventing HA-work9-win10-kvm from
> re-starting anywhere: operation stop failed 'not configured' (6)
>error: unpack_rsc_op:    Preventing HA-work9-win10-kvm from
> re-starting anywhere: operation stop failed 'not configured' (6)
>error: unpack_rsc_op:Preventing HA-work9-win10-kvm from
> re-starting anywhere: operation stop failed 'not configured' (6)
>error: unpack_rsc_op:Preventing HA-work9-win10-kvm from
> re-starting anywhere: operation stop failed 'not configured' (6)
>error: unpack_rsc_op:Preventing HA-work9-win10-kvm from
> re-starting anywhere: operation stop failed 'not configured' (6)
>error: native_create_actions:Resource HA-work9-win10-kvm is
> active on 3 nodes (attempting recovery)
> 
> Something buggy there, or I'm missing something obvious?
> 
> many thanks, L.
> 
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Resource-agents log is not output to /var/log/pacemaker/pacemaker.log on RHEL8

2019-05-29 Thread Ken Gaillot
On Wed, 2019-05-29 at 16:53 +0900, 飯田雄介 wrote:
> Hi Ken and Jan,
> 
> Thank you for your comment.
> 
> I understand that the solution is to set PCMK_logfile in the sysconfig
> file.
> 
> As a permanent fix, if you use the default values inside Pacemaker,
> how about setting environment variables using set_daemon_option()
> there?

That would be better. I was just going to change the shipped sysconfig
because it's easy to do immediately, but changing the code would handle
cases where users auto-generate a sysconfig that doesn't include it,
launch pacemaker manually for testing, etc. However that'll have to
wait for the next release.

> For example, as PCMK_logfacility does.
> 
https://github.com/ClusterLabs/pacemaker/blob/Pacemaker-2.0.2-rc2/lib/common/logging.c#L806
> 
> BTW, Pacemaker writes to /var/log/pacemaker/pacemaker.log via libqb.
> RA writes to this file with echo redirect.
> If writing occurs at the same time, is there a risk that the file may
> be corrupted or the written log may disappear?
> I have never actually had a problem, but I'm interested in how this
> might happen.
> 
> Regards,
> Yusuke
> 
> > On Tue, May 28, 2019 at 23:56, Jan Pokorný wrote:
> > On 28/05/19 09:29 -0500, Ken Gaillot wrote:
> > > On Mon, 2019-05-27 at 14:12 +0900, 飯田雄介 wrote:
> > >> By the way, when /var/log/pacemaker/pacemaker.log is explicitly
> > set
> > >> in the PCMK_logfile, it is confirmed that the resource-agents
> > log is
> > >> output to the file set in the PCMK_logfile.
> > > 
> > > Interesting ... the resource-agents library must look for
> > PCMK_logfile
> > > as well as HA_logfile. In that case, the easiest solution will be
> > for
> > > us to set PCMK_logfile explicitly in the shipped sysconfig file.
> > I can
> > > squeeze that into the soon-to-be-released 2.0.2 since it's not a
> > code
> > > change.
> > 
> > Solution remains the same, only meant to note that presence of
> > either:
> > 
> >   PCMK_logfile
> >   HA_logfile (likely on the way towards deprecation, preferably
> > avoid)

Yep, which brings up the question of what OCF should do. Currently
neither is part of the standard.

> > in the environment (from respective sysconfig/default/conf.d file
> > for
> > pacemaker) will trigger export of HA_LOGFILE environment variable
> > propagated subsequently towards the agent processes, and everything
> > then works as expected.  IOW. OCF and/or resource-agents are still
> > reasonably decoupled, thankfully.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] Maintenance & Pacemaker Restart Demotes MS Resources

2019-06-05 Thread Ken Gaillot
On Wed, 2019-06-05 at 13:28 -0700, Dirk Gassen wrote:
> Thanks for your quick reply. I should have been a bit more verbose in
> my problem description.
> 
> After starting up pacemaker again and before "crm node testras3
> ready" I did actually monitor the cluster with "crm_mon" and waited
> until it indicated that it knew about the states of the resources.
> 
> Here is actually the excerpt from syslog:
> * crm node maintenance testras3
> > 16:14:50 On loss of CCM Quorum: Ignore
> > 16:14:50 Forcing unmanaged master MariaDB:0 to remain promoted on
> testras3
> > 16:14:50 Calculated Transition 12: /var/lib/pacemaker/pengine/pe-
> input-72.bz2
> * systemctl stop pacemaker
> > 16:15:29 On loss of CCM Quorum: Ignore
> > 16:15:29 Forcing unmanaged master MariaDB:0 to remain promoted on
> testras3

Ah, there is no master score for MariaDB, so when the node leaves
maintenance mode, the resource must be demoted.

Restarting pacemaker clears all transient node attributes (including
the master score). The next monitor would set it again, but maintenance
mode cancels monitors, so it won't run until it comes out of
maintenance mode, at which point it wants to do the demote.

A good way around this would be to unmanage the MariaDB resource before
putting the node in maintenance. When you take the node out of
maintenance, the monitor will start up again, but it won't take any
actions. Once the monitor runs and sets the master score (which you can
confirm with crm_master --query --resource MariaDB --node <node>), you
can manage the resource.
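
Putting that together with the commands from this thread, a sketch of
the sequence (the names are from the original post):

crm resource unmanage MariaDB
crm node maintenance testras3
systemctl stop pacemaker
systemctl start pacemaker
crm node ready testras3
# once the monitor has run again, the promotion score should be back:
crm_master --query --resource MariaDB --node testras3
crm resource manage MariaDB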

> > 16:15:29 Scheduling Node testras3 for shutdown
> > 16:15:29 Calculated Transition 13: /var/lib/pacemaker/pengine/pe-
> input-73.bz2
> > 16:15:29 Invoking handler for signal 15: Terminated
> * systemctl start pacemaker
> > 16:15:57 Additional logging available in /var/log/pacemaker.log
> > 16:16:20 On loss of CCM Quorum: Ignore
> > 16:16:20 Calculated Transition 0: /var/lib/pacemaker/pengine/pe-
> input-74.bz2
> > 16:16:20 On loss of CCM Quorum: Ignore
> > 16:16:20 Forcing unmanaged master MariaDB:0 to remain promoted on
> testras3
> > 16:16:20 Calculated Transition 1: /var/lib/pacemaker/pengine/pe-
> input-75.bz2
> * crm node ready testras3
> > 16:18:01 On loss of CCM Quorum: Ignore
> > 16:18:01 StopAppserverIP#011(testras3)
> > 16:18:01 Demote  MariaDB:0#011(Master -> Slave testras3)
> > 16:18:01 Calculated Transition 2: /var/lib/pacemaker/pengine/pe-
> input-76.bz2
> > 16:18:01 On loss of CCM Quorum: Ignore
> > 16:18:01 Start   AppserverIP#011(testras3)
> > 16:18:01 Promote MariaDB:0#011(Slave -> Master testras3)
> > 16:18:01 Calculated Transition 3: /var/lib/pacemaker/pengine/pe-
> input-77.bz2
> > 16:18:02 On loss of CCM Quorum: Ignore
> > 16:18:02 Calculated Transition 4: /var/lib/pacemaker/pengine/pe-
> input-78.bz2
> 
> So it looks like to me that the cluster is demoting ms_MariaDB from
> Master to Slave. I'm not sure if I should have waited for something
> else to occur?
> 
> I have attached pe-input-76.bz2.
> 
> Dirk
> 
> On Wed, Jun 5, 2019 at 10:22 AM Ken Gaillot 
> wrote:
> > On Wed, 2019-06-05 at 07:40 -0700, Dirk Gassen wrote:
> > > Hi,
> > > 
> > > I have the following CIB:
> > > > primitive AppserverIP IPaddr \
> > > > params ip=10.1.8.70 cidr_netmask=255.255.255.192
> > nic=eth0 \
> > > > op monitor interval=30s
> > > > primitive MariaDB mysql \
> > > > params binary="/usr/bin/mysqld_safe"
> > > pid="/var/run/mysqld/mysqld.pid"
> > socket="/var/run/mysqld/mysqld.sock"
> > > replication_user=repl replication_passwd="r3plic@tion"
> > > max_slave_lag=15 evict_outdated_slaves=false test_user=repl
> > > test_passwd="r3plic@tion" config="/etc/mysql/my.cnf" user=mysql
> > > group=mysql datadir="/opt/mysql" \
> > > > op monitor interval=27s role=Master OCF_CHECK_LEVEL=1 \
> > > > op monitor interval=35s timeout=30 role=Slave
> > > OCF_CHECK_LEVEL=1 \
> > > > op start interval=0 timeout=130 \
> > > > op stop interval=0 timeout=130
> > > > ms ms_MariaDB MariaDB \
> > > > meta master-max=1 master-node-max=1 clone-node-max=1
> > > notify=true globally-unique=false target-role=Started is-
> > managed=true
> > > > colocation colo_sm_aip inf: AppserverIP:Started
> > ms_MariaDB:Master
> > > 
> > > When I do "crm node testras3 maintenance && systemctl stop
> > pacemaker
> > &g

[ClusterLabs] Pacemaker 2.0.2 final release now available

2019-06-06 Thread Ken Gaillot


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Pacemaker 2.0.2 final release now available

2019-06-06 Thread Ken Gaillot
On Thu, 2019-06-06 at 10:12 -0500, Ken Gaillot wrote:
> 

While I appreciate brevity, this was my e-mail client eating a draft.
:-/

Source code for the Pacemaker 2.0.2 and 1.1.21 releases is now
available:

https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.0.2

https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.21

This is primarily a security and bug fix release, with stricter two-way
authentication of inter-process communication. The most significant
issue this fixes is a privilege escalation vulnerability allowing an
attacker with login access on a node to use an impostor pacemaker
subdaemon to gain root privileges if pacemaker is started after the
impostor.

The 2.0.2 release also has a few small features:

* crm_resource --validate can now be run using resource parameters from
the command line rather than the CIB, so configurations can be tested
before trying to add them

* crm_resource --clear now prints out any cleared constraints, so you
know when it did something

* A new HealthIOWait resource agent is available for tracking node
health based on CPU I/O wait

* A couple of experimental features discussed earlier on this list: a
new tool crm_rule can check for rule expiration, and stonith_admin now
supports XML output for easier machine parsing.

For more details about changes in this release, please see the change
logs:

https://github.com/ClusterLabs/pacemaker/blob/Pacemaker-2.0.2/ChangeLog

https://github.com/ClusterLabs/pacemaker/blob/Pacemaker-1.1.21/ChangeLog

Many thanks to all contributors of source code to this release,
including Chris Lumens, Gao,Yan, Jan Pokorný, Jehan-Guillaume de
Rorthais, Ken Gaillot, Klaus Wenninger, and Maciej Sobkowiak.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

[ClusterLabs] Pacemaker 2.0.2-rc3 now available

2019-05-30 Thread Ken Gaillot
Source code for the third (and likely final) release candidate for
Pacemaker version 2.0.2 is now available at:

https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.0.2-rc3

This fixes regressions found in rc2. I expect this will become the
final release next week. For details, please see the change log:

https://github.com/ClusterLabs/pacemaker/blob/Pacemaker-2.0.2-rc3/ChangeLog

Everyone is encouraged to download, compile and test the new release.
We do many regression tests and simulations, but we can't cover all
possible use cases, so your feedback is important and appreciated.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Resource-agents log is not output to /var/log/pacemaker/pacemaker.log on RHEL8

2019-05-30 Thread Ken Gaillot
On Wed, 2019-05-29 at 09:23 -0500, Ken Gaillot wrote:
> On Wed, 2019-05-29 at 16:53 +0900, 飯田雄介 wrote:
> > Hi Ken and Jan,
> > 
> > Thank you for your comment.
> > 
> > I understand that solusion is to set PCMK_logfile in the sysconfig
> > file.
> > 
> > As a permanent fix, if you use the default values inside Pacemaker,
> > how about setting environment variables using set_daemon_option()
> > there?
> 
> That would be better. I was just going to change the shipped
> sysconfig
> because it's easy to do immediately, but changing the code would
> handle
> cases where users auto-generate a sysconfig that doesn't include it,
> launch pacemaker manually for testing, etc. However that'll have to
> wait for the next release.

Hi,

Since this is a regression in 2.0.0, and the change was small, I
decided to include it in 2.0.2-rc3 after all. Thanks for investigating
and reporting the issue!

> > For example, as PCMK_logfacility does.
> > 
> 
> 
https://github.com/ClusterLabs/pacemaker/blob/Pacemaker-2.0.2-rc2/lib/common/logging.c#L806
> > 
> > BTW, Pacemaker writes to /var/log/pacemaker/pacemaker.log via
> > libqb.
> > RA writes to this file with echo redirect.
> > If writing occurs at the same time, is there a risk that the file
> > may
> > be corrupted or the written log may disappear?
> > I have never actually had a problem, but I'm interested in how this
> > might happen.
> > 
> > Regards,
> > Yusuke
> > 
> > On Tue, May 28, 2019 at 23:56, Jan Pokorný wrote:
> > > On 28/05/19 09:29 -0500, Ken Gaillot wrote:
> > > > On Mon, 2019-05-27 at 14:12 +0900, 飯田雄介 wrote:
> > > > > By the way, when /var/log/pacemaker/pacemaker.log is
> > > > > explicitly
> > > 
> > > set
> > > > > in the PCMK_logfile, it is confirmed that the resource-agents
> > > 
> > > log is
> > > > > output to the file set in the PCMK_logfile.
> > > > 
> > > > Interesting ... the resource-agents library must look for
> > > 
> > > PCMK_logfile
> > > > as well as HA_logfile. In that case, the easiest solution will
> > > > be
> > > 
> > > for
> > > > us to set PCMK_logfile explicitly in the shipped sysconfig
> > > > file.
> > > 
> > > I can
> > > > squeeze that into the soon-to-be-released 2.0.2 since it's not
> > > > a
> > > 
> > > code
> > > > change.
> > > 
> > > Solution remains the same, only meant to note that presence of
> > > either:
> > > 
> > >   PCMK_logfile
> > >   HA_logfile (likely on the way towards deprecation, preferably
> > > avoid)
> 
> Yep, which brings up the question of what OCF should do. Currently
> neither is part of the standard.
> 
> > > in the environment (from respective sysconfig/default/conf.d file
> > > for
> > > pacemaker) will trigger export of HA_LOGFILE environment variable
> > > propagated subsequently towards the agent processes, and
> > > everything
> > > then works as expected.  IOW. OCF and/or resource-agents are
> > > still
> > > reasonably decoupled, thankfully.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] info: mcp_cpg_deliver: Ignoring process list sent by peer for local node

2019-05-29 Thread Ken Gaillot
On Wed, 2019-05-29 at 17:28 +0100, lejeczek wrote:
> hi guys,
> 
> I have a 3-nodes cluster but one node is a freaking mystery to me. I
> see
> this:
> 
> May 29 17:21:45 [51617] rider.private pacemakerd: info:
> pcmk_cpg_membership:Node 3 still member of group pacemakerd
> (peer=rider.private, counter=0.2)
> May 29 17:21:45 [51617] rider.private pacemakerd: info:
> mcp_cpg_deliver:Ignoring process list sent by peer for local node
> May 29 17:21:45 [51617] rider.private pacemakerd: info:
> mcp_cpg_deliver:Ignoring process list sent by peer for local node

These are harmless and unrelated.

> and I wonder if it in any way relates to the fact that the node says:
> 
> $ crm_mon --one-shot
> Connection to cluster failed: Transport endpoint is not connected
> $ pcs status --all
> Error: cluster is not currently running on this node

What user are you running as? That's expected if the user isn't either
root or in the haclient group.
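
As a quick check, "id <user>" should list haclient among that user's
groups; if it doesn't, something like "usermod -a -G haclient <user>"
followed by a fresh login should sort it out (substitute whatever user
you're running the commands as).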

> 
> and:
> $ systemctl status -l pacemaker.service 
> ● pacemaker.service - Pacemaker High Availability Cluster Manager
>Loaded: loaded (/usr/lib/systemd/system/pacemaker.service;
> disabled; vendor preset: disabled)
>Active: active (running) since Wed 2019-05-29 17:21:45 BST; 7s ago
>  Docs: man:pacemakerd
>
> https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/index.html
>  Main PID: 51617 (pacemakerd)
> Tasks: 1
>Memory: 3.3M
>CGroup: /system.slice/pacemaker.service
>└─51617 /usr/sbin/pacemakerd -f
> 
> May 29 17:21:45 rider.private pacemakerd[51617]:   notice: Tracking
> existing pengine process (pid=51528)
> May 29 17:21:45 rider.private pacemakerd[51617]:   notice: Tracking
> existing lrmd process (pid=51542)
> May 29 17:21:45 rider.private pacemakerd[51617]:   notice: Tracking
> existing stonithd process (pid=51558)
> May 29 17:21:45 rider.private pacemakerd[51617]:   notice: Tracking
> existing attrd process (pid=51559)
> May 29 17:21:45 rider.private pacemakerd[51617]:   notice: Tracking
> existing cib process (pid=51560)
> May 29 17:21:45 rider.private pacemakerd[51617]:   notice: Tracking
> existing crmd process (pid=51566)
> May 29 17:21:45 rider.private pacemakerd[51617]:   notice: Quorum
> acquired
> May 29 17:21:45 rider.private pacemakerd[51617]:   notice: Node
> whale.private state is now member
> May 29 17:21:45 rider.private pacemakerd[51617]:   notice: Node
> swir.private state is now member
> May 29 17:21:45 rider.private pacemakerd[51617]:   notice: Node
> rider.private state is now member
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] two virtual domains start and stop every 15 minutes

2019-06-14 Thread Ken Gaillot
On Fri, 2019-06-14 at 18:27 +0200, Lentes, Bernd wrote:
> Hi,
> 
> i had that problem already once but still it's not clear for me what
> really happens.
> I had this problem some days ago:
> I have a 2-node cluster with several virtual domains as resources. I
> put one node (ha-idg-2) into standby, and two running virtual domains
> were migrated to the other node (ha-idg-1). The other virtual domains
> were already running on ha-idg-1.
> Since then the two virtual domains which migrated (vm_idcc_devel and
> vm_severin) start or stop every 15 minutes on ha-idg-1.
> ha-idg-2 resides in standby.
> I know that the 15 minutes interval is related to the "cluster-
> recheck-interval".
> But why are these two domains started and stopped ?
> I looked around much in the logs, checked the pe-input files, watched
> some graphs created by crm_simulate with dotty ...
> I always see that the domains are started and 15 minutes later
> stopped and 15 minutes later started ...
> but i don't see WHY. I would really like to know that.
> And why are the domains not started from the monitor resource
> operation ? It should recognize that the domain is stopped and starts
> it again. My monitor interval is 30 seconds.
> I had two errors pending concerning these domains, a failed migrate
> from ha-idg-1 to ha-idg-2, form some time before.
> Could that be the culprit ?
> 
> I still have all the logs from that time, if you need information
> just let me know.

Yes, the logs and pe-input files would be helpful. It sounds like a bug
in the scheduler. What version of pacemaker are you running?
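
In the meantime, if you want to dig further yourself, re-running the
scheduler on one of the pe-input files from just before an unwanted
stop is usually the quickest way to see why an action was chosen --
roughly:

   crm_simulate -S -s -x /var/lib/pacemaker/pengine/pe-input-NNN.bz2

(substitute the file number of interest; -s shows the allocation
scores, -S simulates the resulting actions).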

> 
> Thanks.
> 
> 
> Bernd
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Two node cluster goes into split brain scenario during CPU intensive tasks

2019-06-25 Thread Ken Gaillot
On Tue, 2019-06-25 at 11:06 +, Somanath Jeeva wrote:
> I have not configured fencing in our setup . However I would like to
> know if the split brain can be avoided when high CPU occurs. 

Fencing *is* the way to prevent split brain. If the nodes can't see
each other, one will power down the other, and be able to continue on.

Of course that doesn't address the root cause of the split, but it's
the only way the cluster can recover from a split.

Addressing the root cause, I'd first make sure corosync is running at
real-time priority (I forget the ps option, hopefully someone else can
chime in). Another possibility would be to raise the corosync token
timeout to allow for a greater time before a split is declared.
Finally, if the work causing the load is scheduled, you can schedule
the cluster for maintenance mode during the same time frame, so the
cluster will refrain from reacting to events until that window ends.
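
If memory serves, something like this should show the scheduling class
(double-check on your system):

   ps -eo pid,cls,rtprio,comm | grep corosync

CLS should be RR or FF with a non-zero RTPRIO if corosync really is
running at real-time priority. The token timeout lives in the totem
section of corosync.conf, e.g. "token: 10000" for 10 seconds -- treat
that number as an illustration, not a recommendation.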

> 
> With Regards
> Somanath Thilak J
> 
> -Original Message-
> From: Ken Gaillot  
> Sent: Monday, June 24, 2019 20:28
> To: Cluster Labs - All topics related to open-source clustering
> welcomed ; Somanath Jeeva <
> somanath.je...@ericsson.com>
> Subject: Re: [ClusterLabs] Two node cluster goes into split brain
> scenario during CPU intensive tasks
> 
> On Mon, 2019-06-24 at 08:52 +0200, Jan Friesse wrote:
> > Somanath,
> > 
> > > Hi All,
> > > 
> > > I have a two node cluster with multicast (udp) transport . The 
> > > multicast IP used in 224.1.1.1 .
> > 
> > Would you mind to give a try to UDPU (unicast)? For two node
> > cluster 
> > there is going to be no difference in terms of speed/throughput.
> > 
> > > 
> > > Whenever there is a CPU intensive task the pcs cluster goes into 
> > > split brain scenario and doesn't recover automatically . We have
> > > to
> 
> In addition to others' comments: if fencing is enabled, split brain
> should not be possible. Automatic recovery should work as long as
> fencing succeeds. With fencing disabled, split brain with no
> automatic recovery can definitely happen.
> 
> > > do a manual restart of services to bring both nodes online
> > > again. 
> > 
> > Before the nodes goes into split brain , the corosync log shows ,
> > > 
> > > May 24 15:10:02 server1 corosync[4745]:  [TOTEM ] Retransmit
> > > List:
> > > 7c 7e
> > > May 24 15:10:02 server1 corosync[4745]:  [TOTEM ] Retransmit
> > > List:
> > > 7c 7e
> > > May 24 15:10:02 server1 corosync[4745]:  [TOTEM ] Retransmit
> > > List:
> > > 7c 7e
> > > May 24 15:10:02 server1 corosync[4745]:  [TOTEM ] Retransmit
> > > List:
> > > 7c 7e
> > > May 24 15:10:02 server1 corosync[4745]:  [TOTEM ] Retransmit
> > > List:
> > > 7c 7e
> > 
> > This is usually happening when:
> > - multicast is somehow rate-limited on switch side
> > (configuration/bad 
> > switch implementation/...)
> > - MTU of network is smaller than 1500 bytes and fragmentation is
> > not 
> > allowed -> try reduce totem.netmtu
> > 
> > Regards,
> >Honza
> > 
> > 
> > > May 24 15:51:42 server1 corosync[4745]:  [TOTEM ] A processor 
> > > failed, forming new configuration.
> > > May 24 16:41:42 server1 corosync[4745]:  [TOTEM ] A new
> > > membership
> > > (10.241.31.12:29276) was formed. Members left: 1 May 24 16:41:42 
> > > server1 corosync[4745]:  [TOTEM ] Failed to receive the leave 
> > > message. failed: 1
> > > 
> > > Is there any way we can overcome this or this may be due to any 
> > > multicast issues in the network side.
> > > 
> > > With Regards
> > > Somanath Thilak J
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 
> > > ___
> > > Manage your subscription:
> > > 
https://protect2.fireeye.com/url?k=cf120bda-9398df1b-cf124b41-863d9b
> > > cb726f-
> > > 716d821bbcb5bd46=1=https%3A%2F%2Flists.clusterlabs.org%2F
> > > mailman%2Flistinfo%2Fusers
> > > 
> > > ClusterLabs home: 
> > > 
https://protect2.fireeye.com/url?k=eb2ec5bb-b7a4117a-eb2e8520-863d9b
> > > cb726f-b47e1043056350cb=1=https%3A%2F%2F
> > > www.clusterlabs.org%2F
> > > 
> > 
> > ___
> > Manage your subscription:
> > 
https://protect2.fireeye.com/url?k=99a652fd-c52c863c-99a61266-863d9bcb
> > 726f-
> > 72abff69ac96d9a3=1=https%3A%2F%2Flists.clusterlabs.org%2Fmail
> > man%2Flistinfo%2Fusers
> > 
> > ClusterLabs home: 
> > 
https://protect2.fireeye.com/url?k=d77f0141-8bf5d580-d77f41da-863d9bcb
> > 726f-0762985c29a467ea=1=https%3A%2F%2Fwww.clusterlabs.org%2F
> 
> --
> Ken Gaillot 
> 
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] iSCSI Target resource starts on both nodes -- despite my colocation constraint

2019-06-25 Thread Ken Gaillot
On Mon, 2019-06-24 at 14:45 -0500, Bryan K. Walton wrote:
> On Mon, Jun 24, 2019 at 12:02:59PM -0500, Ken Gaillot wrote:
> > > Jun 20 11:48:36 storage1 crmd[240695]:  notice: Transition 1
> > > (Complete=12, Pending=0, Fired=0, Skipped=0, Incomplete=0,
> > > Source=/var/lib/pacemaker/pengine/pe-input-1054.bz2): Complete
> > > Jun 20 11:48:36 storage1 pengine[240694]:   error: Resource
> > > targetRHEVM is active on 2 nodes (attempting recovery)
> > 
> > This means that pacemaker found the target active on storage2
> > *without*
> > having scheduled it there -- so either something outside pacemaker
> > is
> > setting up the target, or (less likely) the agent is returning the
> > wrong status.
> 
> Thanks for the reply, Ken.  I can't figure out what might have caused
> these iSCSI targets to already be active.  They aren't configured in
> targetcli (outside of Pacemaker) and I have no scripts that do
> anything
> like that, dynamically.
> 
> I don't have a default-resource-stickiness value set.  Could that
> have
> caused the iSCSI targets to be brought up on the node that was being
> brought out of standby?

No, stickiness wouldn't affect it.

Have you tried checking whether the target is really active before
bringing the node out of standby? That would narrow down whether the
issue is in pacemaker or earlier.

I'd double-check it isn't enabled in systemd. Another possibility is
that some other systemd service requires the target, in which case
systemd might start it even though it's not enabled (I believe the only
way around that would be to put the other service under cluster control
as well).
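
As a rough sketch (assuming targetcli on RHEL/CentOS, where the restore
unit is usually target.service), running these on storage2 while it is
still in standby should narrow it down:

   systemctl is-enabled target.service
   targetcli ls

If the target already shows up there before the node rejoins, pacemaker
is only reporting what it found.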

> 
> Thanks!
> Bryan
> 
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] FW: Fence agent definition under Centos7.6

2019-06-13 Thread Ken Gaillot
Maybe you weren't subscribed to the list when you posted? There was a
reply:

https://lists.clusterlabs.org/pipermail/users/2019-May/025847.html

On Thu, 2019-06-13 at 19:58 +, Michael Powell wrote:
> I’m basically re-posting this request again, since I’ve gotten no
> response over the last two weeks.  If someone can take pity on a
> newbie, I’d sure appreciate it.
>  
> In the interim, I’ve done some experiments, trying to use fence-
> ipmilan in lieu of the mgpstonith fence agent described in the
> previous e-mail.  Without going into a lot of details, the results
> have been unsatisfactory, so I’ve renewed my efforts to get the in-
> house mgpstonith fence agent to work.
>  
> I’m still not sure about the specific question of where the
> mgpstonith executable needs to reside.  By moving it from
> /usr/lib64/stonith/plugins/external to /usr/lib64/stonith/plugins,
> and /usr/sbin,  I was able to eliminate the “Unknown fence agent”
> error.   That said, the following commands produce the subsequent log
> error messages:
>  
> crm configure primitive mgraid-stonith stonith:mgpstonith \
> params hostlist="mgraid-canister" \
> meta requires=”quorum” \
> op monitor interval="0" timeout="20s" 
>  
> This produces the following messages to stderr:
>  
> ERROR: stonith:mgpstonith: got no meta-data, does this RA exist?
> ERROR: stonith:mgpstonith: got no meta-data, does this RA exist?
> ERROR: stonith:mgpstonith: no such resource agent
>  
>  
> What would be most helpful at this point is a full description of the
> Fence Agent API. 
>  
> Regards,
>   Michael Powell
>  
> From: Michael Powell 
> Sent: Friday, May 31, 2019 3:33 PM
> To: users@clusterlabs.org
> Subject: Fence agent definition under Centos7.6
>  
> Although I am personally a novice wrt cluster operation, several
> years ago my company developed a product that used Pacemaker.  I’ve
> been charged with porting that product to a platform running Centos
> 7.6.  The old product ran Pacemaker 1.1.13 and heartbeat.  For the
> most part, the transition to Pacemaker 1.1.19 and Corosync has gone
> pretty well, but there’s one aspect that I’m struggling with: fence-
> agents.
>  
> The old product used a fence agent developed in house to implement
> STONITH.  While it was no trouble to compile and install the code,
> named mgpstonith, I see lots of messages like the following in the
> system log –
>  
> stonith-ng[31120]:error: Unknown fence agent:
> external/mgpstonith
> stonith-ng[31120]:error: Agent external/mgpstonith not found or
> does not support meta-data: Invalid argument (22)
> stonith-ng[31120]:error: Could not retrieve metadata for fencing
> agent external/mgpstonith   
>  
> I’ve put debug messages in mgpstonith, and as they do not appear in
> the system log, I’ve inferred that it is in fact never executed.
>  
> Initially, I installed mgpstonith on /lib64/stonith/plugins/external,
> which is where it was located on the old product.  I’ve copied it to
> other locations, e.g. /usr/sbin, with no better luck.  I’ve searched
> the web and while I’ve found lots of information about using the
> available fence agents, I’ve not turned up any information on how to
> create one “from scratch”.
>  
> Specifically, I need to know where to put mgpstonith on the target
> system(s).  Generally, I’d appreciate a pointer to any
> documentation/specification relevant to writing code for a fence
> agent.
>  
> Thanks,
>   Michael
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] crmsh: Release 4.1.0

2019-06-21 Thread Ken Gaillot
On Fri, 2019-06-21 at 09:19 +, Diego Akechi wrote:
> Hello everyone!
> 
> I'm happy to announce the release of crmsh version 4.1.0.
> 
> This main version brings the crmsh migration to python 3, pacemaker
> 2.0
> compatibility and many bug fixes.

Congrats! Can we expect a talk at the ClusterLabs Summit? :)

BTW We're still aiming for the end of September for the summit, details
should be finalized soon.

> 
> Once again, we would like to thank you Krig and Xin for the hard work
> in
> getting this done, and also to all the other contributors that made
> these release possible.
> 
> There are some other changes in this release as well, see the
> ChangeLog for the complete list of changes:
> 
> * https://github.com/ClusterLabs/crmsh/blob/4.1.0/ChangeLog
> 
> The source code can be downloaded from Github:
> 
> * https://github.com/ClusterLabs/crmsh/releases/tag/4.1.0
> 
> Packaged versions of crmsh should be available shortly from your
> distribution of choice. Development packages for openSUSE Tumbleweed
> are available from the Open Build System, here:
> 
> * 
> http://download.opensuse.org/repositories/network:/ha-clustering:/Factory/
> 
> Archives of the tagged release:
> 
> * https://github.com/ClusterLabs/crmsh/archive/4.1.0.tar.gz
> * https://github.com/ClusterLabs/crmsh/archive/4.1.0.zip
> 
> As usual, a huge thank you to all contributors and users of crmsh!
> 
> 
> Diego Akechi
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Two node cluster goes into split brain scenario during CPU intensive tasks

2019-06-24 Thread Ken Gaillot
On Mon, 2019-06-24 at 08:52 +0200, Jan Friesse wrote:
> Somanath,
> 
> > Hi All,
> > 
> > I have a two node cluster with multicast (udp) transport . The
> > multicast IP used in 224.1.1.1 .
> 
> Would you mind to give a try to UDPU (unicast)? For two node cluster 
> there is going to be no difference in terms of speed/throughput.
> 
> > 
> > Whenever there is a CPU intensive task the pcs cluster goes into
> > split brain scenario and doesn't recover automatically . We have to

In addition to others' comments: if fencing is enabled, split brain
should not be possible. Automatic recovery should work as long as
fencing succeeds. With fencing disabled, split brain with no automatic
recovery can definitely happen.

> > do a manual restart of services to bring both nodes online again. 
> 
> Before the nodes goes into split brain , the corosync log shows ,
> > 
> > May 24 15:10:02 server1 corosync[4745]:  [TOTEM ] Retransmit List:
> > 7c 7e
> > May 24 15:10:02 server1 corosync[4745]:  [TOTEM ] Retransmit List:
> > 7c 7e
> > May 24 15:10:02 server1 corosync[4745]:  [TOTEM ] Retransmit List:
> > 7c 7e
> > May 24 15:10:02 server1 corosync[4745]:  [TOTEM ] Retransmit List:
> > 7c 7e
> > May 24 15:10:02 server1 corosync[4745]:  [TOTEM ] Retransmit List:
> > 7c 7e
> 
> This is usually happening when:
> - multicast is somehow rate-limited on switch side
> (configuration/bad 
> switch implementation/...)
> - MTU of network is smaller than 1500 bytes and fragmentation is not 
> allowed -> try reduce totem.netmtu
> 
> Regards,
>Honza
> 
> 
> > May 24 15:51:42 server1 corosync[4745]:  [TOTEM ] A processor
> > failed, forming new configuration.
> > May 24 16:41:42 server1 corosync[4745]:  [TOTEM ] A new membership
> > (10.241.31.12:29276) was formed. Members left: 1
> > May 24 16:41:42 server1 corosync[4745]:  [TOTEM ] Failed to receive
> > the leave message. failed: 1
> > 
> > Is there any way we can overcome this or this may be due to any
> > multicast issues in the network side.
> > 
> > With Regards
> > Somanath Thilak J
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > ___
> > Manage your subscription:
> > https://lists.clusterlabs.org/mailman/listinfo/users
> > 
> > ClusterLabs home: https://www.clusterlabs.org/
> > 
> 
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] iSCSI Target resource starts on both nodes -- despite my colocation constraint

2019-06-24 Thread Ken Gaillot
storage/vmstorage.
> Jun 20 11:48:42 storage1 iSCSILogicalUnit(lunVMStorage)[42555]: INFO:
> Created LUN 1. Created LUN 1->1 mapping in node ACL iqn.1994-
> 05.com.redhat:84f0f7458c58 Created LUN 1->1 mapping in node ACL
> iqn.1994-05.com.redhat:3d066d1f423e
> Jun 20 11:48:42 storage1 iSCSILogicalUnit(lunVMStorage)[42555]:
> ERROR: No such NetworkPortal in configfs:
> /sys/kernel/config/target/iscsi/iqn.2019-
> 03.com.leepfrog:storage.vmstorage/tpgt_1/np/0.0.0.0:3260
> Jun 20 11:48:42 storage1 lrmd[240692]:  notice:
> lunVMStorage_start_0:42555:stderr [
> /usr/lib/ocf/resource.d/heartbeat/iSCSILogicalUnit: line 410:
> /sys/kernel/config/target/core/iblock_0/lunVMStorage/wwn/vpd_unit_ser
> ial: No such file or directory ]
> Jun 20 11:48:42 storage1 crmd[240695]:  notice: Result of start
> operation for lunVMStorage on storage1: 0 (ok)
> Jun 20 11:48:42 storage1 crmd[240695]:  notice: Initiating monitor
> operation lunVMStorage_monitor_1 locally on storage1
> Jun 20 11:48:42 storage1 crmd[240695]:  notice: Transition 2
> (Complete=18, Pending=0, Fired=0, Skipped=0, Incomplete=0,
> Source=/var/lib/pacemaker/pengine/pe-error-7.bz2): Complete
> Jun 20 11:48:42 storage1 crmd[240695]:  notice: State transition
> S_TRANSITION_ENGINE -> S_IDLE
> 
> Here are the logs from storage2, which was coming back online:
> 
> Jun 20 11:48:36 storage2 crmd[22305]:  notice: Result of probe
> operation for ISCSIMillipedeIP on storage2: 7 (not running)
> Jun 20 11:48:36 storage2 crmd[22305]:  notice: Result of probe
> operation for ISCSICentipedeIP on storage2: 7 (not running)
> Jun 20 11:48:36 storage2 crmd[22305]:  notice: Result of probe
> operation for targetRHEVM on storage2: 0 (ok)
> Jun 20 11:48:36 storage2 crmd[22305]:  notice: Result of probe
> operation for targetVMStorage on storage2: 0 (ok)
> Jun 20 11:48:36 storage2 crmd[22305]:  notice: Result of probe
> operation for lunRHEVM on storage2: 7 (not running)
> Jun 20 11:48:36 storage2 crmd[22305]:  notice: Result of probe
> operation for lunVMStorage on storage2: 7 (not running)
> 
> 
> 
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Minor regression in pacemaker 2.0.2

2019-06-12 Thread Ken Gaillot
Hi all,

A minor regression has been found in pacemaker 2.0.2:

"stonith_admin --list-targets" will not work with fence agents other
than fence_xvm. Apologies for not catching that before release. Anyone
compiling or packaging 2.0.2 from source is recommended to include the
commits from the following pull request:

  https://github.com/ClusterLabs/pacemaker/pull/1808

It includes the fix for that as well as other minor release follow-up
that doesn't affect end users.
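
If you're working from a git checkout, applying the pull request as a
patch series should also work -- roughly (untested sketch, adjust to
your packaging workflow):

   curl -L https://github.com/ClusterLabs/pacemaker/pull/1808.patch | git am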

I consider the regression minor because in the course of fixing it, I
found that stonith_admin --list-targets had a couple of pre-existing
issues making it not very useful. It could only be run successfully on
the node running the fence device (which I will fix separately), and
for most fence agents, which output targets as "name,alias", it would
run them together as "namealias" (which is fixed with this PR, as a
beneficial side effect).
-- 
Ken Gaillot 



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] drbd could not start by pacemaker. strange limited root privileges?

2019-05-23 Thread Ken Gaillot
On Thu, 2019-05-23 at 13:21 +0200, László Neduki wrote:
> Hi,
> 
> (
> I sent a similar question from an other acount 3 days ago, but: 
> - I do not see it on the list. Maybe I should not see my own email? 

A DRBD message from govom...@gmail.com did make it to the list a week
ago. You should get your own emails from the list server, though your
own mail server or client might filter them.

> So I created a new account
> - I have additional infos (but no solution), so I rewrite the
> question
> )
> 
> pacemaker cannot start drbd9 resources. As I see, root has very
> limited privileges in the drbd resource agent, when it run by the
> pacemaker. I downloaded the latest pacemaker this week, and I
> compiled drbd9 rpms also. I hope, You can help me, I do not find the
> cause of this behaviour. Please see the below test cases:

I'm not a DRBD expert, but given the symptoms you describe, my first
thoughts would be that either the ocf:linbit:drbd agent you're using
isn't the right version for your DRBD version, or something like
SELinux is restricting access.
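
If it is SELinux, it usually shows up clearly in the audit log;
something like this should tell you quickly (and "setenforce 0" only as
a temporary test, not a fix):

   getenforce
   ausearch -m avc -ts recent | grep -i drbd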

> 1. When I create Pacemaker DRBD resource I get errors
> # pcs resource create DrbdDB ocf:linbit:drbd drbd_resource=drbd_db op
> monitor interval=60s meta notify=true
> # pcs resource master DrbdDBClone DrbdDB master-max=1 master-node-
> max=1 clone-node-max=1 notify=true
> # pcs constraint location DrbdDBClone prefers node1=INFINITY
> # pcs cluster stop --all; pcs cluster start --all; pcs status
> 
> Failed Actions:
> * DrbdDB_monitor_0 on node1 'not installed' (5): call=6,
> status=complete, exitreason='DRBD kernel (module) not available?',
> last-rc-change='Thu May 23 09:54:09 2019', queued=0ms, exec=58ms
> * DrbdDB_monitor_0 on node2 'not installed' (5): call=6,
> status=complete, exitreason='DRBD kernel (module) not available?',
> last-rc-change='Thu May 23 10:00:22 2019', queued=0ms, exec=71ms
> 
> 2. when I try to start drbd_db by drbdadm directly, it works well:
> # modprobe drbd #on each node
> # drbdadm up drbd_db #on each node
> # drbdadm primary drbd_db
> # drbdadm status 
> it shows drbd_db is UpToDate on each node
> I also can promote and mount filesystem well
> 
> 3. When I use debug-start, it works fine (so the resource syntax
> sould be correct)
> # drbdadm status
> No currently configured DRBD found.
> # pcs resource debug-start DrbdDBMaster
> Error: unable to debug-start a master, try the master's resource:
> DrbdDB
> # pcs resource debug-start DrbdDB #on each node
> Operation start for DrbdDB:0 (ocf:linbit:drbd) returned: 'ok' (0)
> # drbdadm status
> it shows drbd_db is UpToDate on each node
> 
> 4. Pacemaker handle other resources well . If I set auto_promote=yes,
> and I start (but not promote) the drbd_db by drbdadm, then pacemaker
> can create filesystem on it well, and also the appserver, database
> resources. 
> 
> 5. The strangest behaviour for me. Root have very limited privileges
> whitin the drbd resource agent. If I write this line to the
> srbd_start() method of  /usr/lib/ocf/resource.d/linbit/drbd
> 
> ocf_log err "lados " $(whoami) $( ls -l /home/opc/tmp/modprobe2.trace
> ) $( do_cmd touch /home/opc/tmp/modprobe2.trace )
> 
> I got theese messeges in log, when I start the cluster
> 
> # tail -f /var/log/cluster/corosync.log | grep -A 8 -B 3 -i lados
> 
> ...
> May 21 15:35:12  drbd(DrbdDB)[31649]:ERROR: lados  root
> May 21 15:35:12 [31309] node1   lrmd:   notice:
> operation_finished:DrbdDB_start_0:31649:stderr [ ls: cannot
> access /home/opc/tmp/modprobe2.trace: Permission denied ]
> May 21 15:35:12 [31309] node1   lrmd:   notice:
> operation_finished:DrbdFra_start_0:31649:stderr [ touch: cannot
> touch '/home/opc/tmp/modprobe2.trace': Permission denied ]
> ...
> and also, when I try to strace the "modprobe -s drbd `$DRBDADM sh-
> mod-parms`" in drbd resource agent, I only see 1 line in the
> /root/modprobe2.trace. This meens for me:
> - root cannot trace the calls in drbdadm (even if root can strace
> drbdadm outside of pacemaker well)
> - root can write into files his own directory
> (/root/modprobe2.trace) 
> 
> 6. Opposit of previous test
> root has these privileges outside from pacamaker
> 
> # sudo su -
> # touch /home/opc/tmp/modprobe2.trace
> # ls -l /home/opc/tmp/modprobe2.trace
> -rw-r--r--. 1 root root 0 May 21 15:44 /home/opc/tmp/modprobe2.trace
> 
> 
> Thanks: lados.
> 
> 
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

[ClusterLabs] FYI to anyone backporting the recent security fixes

2019-05-24 Thread Ken Gaillot
In case anyone is planning to backport only the recent security fixes
to an older pacemaker version, here is a list of all commits that are
relevant.

2.0 branch:

32ded3e0172e0fae89cf70965e1c0406c1db883b High: libservices: fix use-after-free 
wrt. alert handling
912f5d9ce983339e939e4cc55f27791f8c9baa18 High: pacemakerd vs. IPC/procfs 
confused deputy authenticity issue (0/4)
1148f45da977113dff588cdd1cfebb7a47760b32 High: pacemakerd vs. IPC/procfs 
confused deputy authenticity issue (1/4)
970736b1c7ad5c78cc5295a4231e546104d55893 High: pacemakerd vs. IPC/procfs 
confused deputy authenticity issue (2/4)
052e6045eea77685aabeed12c519c7c9eb9b5287 High: pacemakerd vs. IPC/procfs 
confused deputy authenticity issue (3/4)
d324e407c0e2695f405974d567d79eb91d0ee69a High: pacemakerd vs. IPC/procfs 
confused deputy authenticity issue (4/4)
3ad7b2509d78f95b5dfc8fffc4d9a91be1da5113 Med: controld: fix possible NULL 
pointer dereference
bccf845261c6e69fc4e6bdb8cf4e630a4a4ec7a8 Log: libcrmcluster: improve CPG 
membership messages
7dda20dac25f07eae959ca25cc974ef2fa6daf02 Fix: libcrmcommon: avoid use-of-NULL 
when checking whether process is active
d9b0269d59a00329feb19b6e65b10a233a3dd414 Low: libcrmcommon: return proper code 
if testing pid is denied


1.1 branch:

f91a961112ec9796181b42aa52f9c36dfa3c6a99 High: libservices: fix use-after-free 
wrt. alert handling
ab44422fa955c2dff1ac1822521e7ad335d4aab7 High: pacemakerd vs. IPC/procfs 
confused deputy authenticity issue (0/4)
6888aaf3ad365ef772f8189c9958f58b85ec62d4 High: pacemakerd vs. IPC/procfs 
confused deputy authenticity issue (1/4)
904c53ea311fd6fae945a55202b0a7ccf3783465 High: pacemakerd vs. IPC/procfs 
confused deputy authenticity issue (2/4)
07a82c5c8f9d60989ea88c5a3cc316ee290ea784 High: pacemakerd vs. IPC/procfs 
confused deputy authenticity issue (3/4)
4d6f6e01b309cda7b3f8fe791247566d247d8028 High: pacemakerd vs. IPC/procfs 
confused deputy authenticity issue (4/4)
9dc38d81cb6e1967c368faed78de1927cabf06b3 Med: controld: fix possible NULL 
pointer dereference
83811e2115f5516a7faec2e653b1be3d58b35a79 Log: libcrmcluster: improve CPG 
membership messages
d0c12d98e01bc6228fc254456927d79a46554448 Fix: libcrmcommon: avoid use-of-NULL 
when checking whether process is active
c0e1cf579f57922cbe872d23edf144dd2206156b Low: libcrmcommon: return proper code 
if testing pid is denied
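
For example, against a 1.1-based tree, something like this (untested
sketch; adjust the remote name to your setup, and expect to resolve
conflicts on older branches):

   git remote add upstream https://github.com/ClusterLabs/pacemaker.git
   git fetch upstream
   git cherry-pick f91a9611 ab44422f 6888aaf3 904c53ea 07a82c5c \
       4d6f6e01 9dc38d81 83811e21 d0c12d98 c0e1cf57

The 2.0 hashes above can be applied the same way to a 2.0-based tree.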
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: Re: Constant stop/start of resource in spite of interval=0

2019-05-20 Thread Ken Gaillot
> > > > > was 94afff0ff7cfc62f7cb1d5bf5b4d83aa vs. now 
> > 
> > f2317cad3d54cec5d7d7aa7d0bf35cf8
> > > > > ...
> > > > > 
> > > > > However, after issuing "cibadmin --query --local", the whole
> > > > > flipping 
> > > > > stopped! :-) Thanks!
> > > > 
> > > > No, I was wrong - it still repeats every ~15mins. The diff
> > > > between two
> 
> cib 
> > > > xml dumps doesn't say much to me - I'm going to enable tracing.
> > > 
> > > I have attached the trace file created according to 
> > > http://blog.clusterlabs.org/blog/2013/pacemaker-logging.
> > > 
> > > What looks strange to me is that build_parameter_list() first
> > > rejects
> > > attributes, then accepts them:
> > > 
> > > trace   May 18 23:02:49 build_operation_update(787):0: Including
> > > additional
> > digests for ocf::local:tunnel
> > > trace   May 18 23:02:49 build_parameter_list(621):0: Rejecting
> > > name for 
> > 
> > unique
> > > trace   May 18 23:02:49 build_parameter_list(614):0: Attr id is
> > > unique
> > > trace   May 18 23:02:49 build_parameter_list(632):0: Adding attr
> > > id=0 to
> 
> the 
> > xml result
> > > trace   May 18 23:02:49 build_parameter_list(621):0: Rejecting
> > > src_ip for 
> > 
> > unique
> > > trace   May 18 23:02:49 build_parameter_list(621):0: Rejecting
> > > dst_ip for 
> > 
> > unique
> > > ...
> > > trace   May 18 23:02:49 calculate_xml_digest_v1(71):0: Sorting
> > > xml...
> > > trace   May 18 23:02:49 calculate_xml_digest_v1(73):0: Done
> > > trace   May 18 23:02:49 crm_md5sum(2102):0: Beginning digest of
> > > 22 bytes
> > > trace   May 18 23:02:49 crm_md5sum(2110):0: Digest 
> > 
> > 94afff0ff7cfc62f7cb1d5bf5b4d83aa.
> > > 
> > > and then:
> > > 
> > > trace   May 18 23:02:49 calculate_xml_digest_v1(83):0:
> > > digest:source   
> > 
> > 
> > > trace   May 18 23:02:49 append_restart_list(693):0: tunnel-
> > > eduroam: 
> > 
> > 94afff0ff7cfc62f7cb1d5bf5b4d83aa,  id 
> > > trace   May 18 23:02:49 append_restart_list(694):0: restart
> > > digest source  
> > 
> > > trace   May 18 23:02:49 build_parameter_list(621):0: Rejecting
> > > name for 
> > 
> > private
> > > trace   May 18 23:02:49 build_parameter_list(625):0: Inverting
> > > name match 
> > 
> > for private xml
> > > trace   May 18 23:02:49 build_parameter_list(632):0: Adding attr 
> > 
> > name=eduroam IPv4 tunnel to the xml result
> > > trace   May 18 23:02:49 build_parameter_list(621):0: Rejecting id
> > > for 
> > 
> > private
> > > trace   May 18 23:02:49 build_parameter_list(625):0: Inverting id
> > > match for
> > private xml
> > > trace   May 18 23:02:49 build_parameter_list(632):0: Adding attr
> > > id=0 to
> 
> the 
> > xml result
> > > 
> > > 
> > > By the way, it's debian stretch with pacemaker 1.1.16-1.
> > 
> > I have double and triple checked the agent and it seems just a
> > normal, 
> > working agent.
> > 
> > The agent accepts the reload operation, it is advertised in the
> > actions 
> > section of its metadata, there are parameters with unique set to 0
> > and 
> > still stop/start is called instead of reload. (I could even live
> > with 
> > reload instead of start/stop in every 15 mins).
> > 
> > As a desperate attempt, I deleted the resource and re-added and it
> > of 
> > course did not help.
> > 
> > I also created the attached trace file during creating the resource
> > in the 
> > hope that it could help find the reason of the permanent
> > stop/start.
> > 
> > Best regards,
> > Jozsef
> > --
> > E-mail : kadlecsik.joz...@wigner.mta.hu 
> > PGP key: http://www.kfki.hu/~kadlec/pgp_public_key.txt 
> > Address: Wigner Research Centre for Physics, Hungarian Academy of
> > Sciences
> >  H-1525 Budapest 114, POB. 49, Hungary
> 
> 
> 
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

[ClusterLabs] Pacemaker 2.0.2-rc2 now available

2019-05-21 Thread Ken Gaillot
Source code for the second (and possibly final) release candidate for
Pacemaker version 2.0.2 is now available at:

https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.0.2-rc2

This fixes a few memory issues found in rc1. If no issues are found in
this one in a week or so, I'll release it as final. For more details
about changes in this release, please see the change log:

https://github.com/ClusterLabs/pacemaker/blob/Pacemaker-2.0.2-rc2/ChangeLog

Everyone is encouraged to download, compile and test the new release.
We do many regression tests and simulations, but we can't cover all
possible use cases, so your feedback is important and appreciated.

A 1.1.21-rc1 with selected backports from the 2.0.2 release candidates
will also be released soon.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Fwd: Postgres pacemaker cluster failure

2019-05-16 Thread Ken Gaillot
On Thu, 2019-05-16 at 10:20 +0200, Jehan-Guillaume de Rorthais wrote:
> On Wed, 15 May 2019 16:53:48 -0500
> Ken Gaillot  wrote:
> 
> > On Wed, 2019-05-15 at 11:50 +0200, Jehan-Guillaume de Rorthais
> > wrote:
> > > On Mon, 29 Apr 2019 19:59:49 +0300
> > > Andrei Borzenkov  wrote:
> > >   
> > > > 29.04.2019 18:05, Ken Gaillot пишет:  
> > > > > >
> > > > > > > Why does not it check OCF_RESKEY_CRM_meta_notify?
> > > > > > 
> > > > > > I was just not aware of this env variable. Sadly, it is not
> > > > > > documented
> > > > > > anywhere :(
> > > > > 
> > > > > It's not a Pacemaker-created value like the other notify
> > > > > variables --
> > > > > all user-specified meta-attributes are passed that way. We do
> > > > > need to
> > > > > document that.
> > > > 
> > > > OCF_RESKEY_CRM_meta_notify is passed also when "notify" meta-
> > > > attribute
> > > > is *not* specified, as well as a couple of others. But not
> > > > all   
> > 
> > Hopefully in that case it's passed as false? I vaguely remember
> > some
> > case where clone attributes were mistakenly passed to non-clone
> > resources, but I think notify is always accurate for clone
> > resources.
> 
> [1]
> 
> > > > possible
> > > > attributes. And some OCF_RESKEY_CRM_meta_* variables that are
> > > > passed do
> > > > not correspond to any user settable and documented meta-
> > > > attribute,
> > > > like
> > > > OCF_RESKEY_CRM_meta_clone.  
> > > 
> > > Sorry guys, now I am confused.  
> > 
> > A well-known side effect of pacemaker ;)
> > 
> > > Is it safe or not to use OCF_RESKEY_CRM_meta_notify? You both
> > > doesn't
> > > seem to
> > > agree where it comes from. Is it only a non expected side effect
> > > or
> > > is it safe
> > > and stable code path in Pacemaker we can rely on?  
> > 
> > It's reliable. All user-specified meta-attributes end up as
> > environment
> > variables 
> 
> OK...
> 
> > -- it's just meta-attributes that *aren't* specified by the
> > user that may or may not show up
> 
> OK...
> 
> > (but hopefully with the correct value).
> 
> And that's where I am now loosing some confidence about this
> environment vars :)
> "Hopefully" and "I think is accurate" ([1]) are quite scary to me :/

It looks perfectly reliable to me :) but Andrei's comments make me want
more information.

If I understand correctly, he's saying that the presence of the notify
variable is unreliable. That's fine if the option is not specified by
the user, and the variable is either not present or present as false.
But it would indicate a bug if the variable is not present when the
option *is* specified by the user, or if the variable is present as
true when the option is not specified by the user.

Personally I'd rely on it.
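
For an agent that wants to enforce notify=true, a minimal shell sketch
(assuming the usual ocf-shellfuncs helpers) would be something like:

   if ! ocf_is_true "${OCF_RESKEY_CRM_meta_notify:-false}"; then
       ocf_exit_reason "the clone must be configured with notify=true"
       exit $OCF_ERR_CONFIGURED
   fi

which avoids shelling out to crm_resource entirely.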

The controller gets the environment variable values from the
 entries in the scheduler's result. We have numerous
examples in the scheduler regression test data, typically installed
under /usr/share/pacemaker/tests in scheduler/*.exp (for 2.0) or
pengine/test10/*.exp (for 1.1).
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] Antw: Re: Constant stop/start of resource in spite of interval=0

2019-05-20 Thread Ken Gaillot
On Mon, 2019-05-20 at 23:15 +0200, Kadlecsik József wrote:
> Hi,
> 
> On Mon, 20 May 2019, Ken Gaillot wrote:
> 
> > On Mon, 2019-05-20 at 15:29 +0200, Ulrich Windl wrote:
> > > What worries me is "Rejecting name for unique".
> > 
> > Trace messages are often not user-friendly. The rejecting/accepting
> > is 
> > nothing to be concerned about; it just refers to which parameters
> > are 
> > being used to calculate that particular hash.
> > 
> > Pacemaker calculates up to three hashes.
> > 
> > The first is a hash of all the resource parameters, to detect if
> > anything changed; this is stored as "op-digest" in the CIB status
> > entries.
> > 
> > If the resource is reloadable, another hash is calculated with just
> > the
> > parameters marked as unique=1 (which means they can't be reloaded).
> > Any
> > change in these parameters requires a full restart. This one is
> > "op-
> > restart-digest".
> > 
> > Finally, if the resource has sensitive parameters like passwords, a
> > hash of everything but those parameters is stored as "op-secure-
> > digest". This one is only used when simulating CIBs grabbed from
> > cluster reports, which have sensitive info scrubbed.
> 
> Thanks for the explanation! It seemed very cryptic in the trace
> messages 
> that different hashes were calculated with differen parameter lists.
>  
> > From what's described here, the op-restart-digest is changing every
> > time, which means something is going wrong in the hash comparison
> > (since the definition is not really changing).
> > 
> > The log that stands out to me is:
> > 
> > trace   May 18 23:02:49 calculate_xml_digest_v1(83):0:
> > digest:source   
> > 
> > The id is the resource name, which isn't "0". That leads me to:
> > 
> > trace   May 18 23:02:49 svc_read_output(87):0: Got 499 chars:
> > 
> > 
> > which is the likely source of the problem. "id" is a pacemaker
> > property, 
> > not an OCF resource parameter. It shouldn't be in the resource
> > agent 
> > meta-data. Remove that, and I bet it will be OK.
> 
> I renamed the parameter to "tunnel_id", redefined the resources and 
> started them again.
>  
> > BTW the "every 15 minutes" would be the cluster-recheck-interval
> > cluster property.
> 
> I have waited more than half an hour and there are no more 
> stopping/starting of the resources. :-) I haven't thought that "id"
> is 
> reserved as parameter name.

It isn't, by the OCF standard. :) This could be considered a pacemaker
bug; pacemaker should be able to distinguish its own "id" from an OCF
parameter "id", but it currently can't.

> 
> Thank you!
> 
> Best regards,
> Jozsef
> --
> E-mail : kadlecsik.joz...@wigner.mta.hu
> PGP key: http://www.kfki.hu/~kadlec/pgp_public_key.txt
> Address: Wigner Research Centre for Physics, Hungarian Academy of
> Sciences
>  H-1525 Budapest 114, POB. 49, Hungary
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] Fwd: Postgres pacemaker cluster failure

2019-04-29 Thread Ken Gaillot
 ( lc($ans) =~ /^true$|^on$|^yes$|^y$|^1$/ ) {
> > ocf_exit_reason(
> > 'You must set meta parameter notify=true for your
> > master
> > resource'
> > );
> > exit $OCF_ERR_INSTALLED;
> > }
> > 
> > but that is wrong - "notify" is set on ms definition, while
> > $OCF_RESOURCE_INSTANCE refers to individual clone member. There is
> > no
> > notify option on PGSQL primitive.
> 
> Interesting...and disturbing. I wonder why I never faced a bug
> related to this
> after so many tests in various OS and a bunch of running clusters in
> various
> environments. Plus, it hasn't been reported sooner by anyone.
> 
> Is it possible the clone members inherit this from the master
> definition or
> "crm_resource" to look at this higher level?

That's correct. For clone/master/group/bundle resources, setting meta-
attributes on the collective resource makes them effective for the
inner resources as well. So I don't think that's causing any issues
here.

> If I set a meta attribute at master level, it appears on clones as
> well:
> 
>   > crm_resource --resource pgsql-ha --meta --get-parameter=clone-max
>   pgsql-ha is active on more than one node, returning the default
> value for
>   clone-max 
>   Attribute 'clone-max' not found for 'pgsql-ha' 
>   Error performing operation: No such device or address
> 
>   > crm_resource --resource pgsqld --meta --get-parameter=clone-max
>   Attribute 'clone-max' not found for 'pgsqld:0'
>   Error performing operation: No such device or address
> 
>   > crm_resource --resource=pgsql-ha --meta --set-parameter=clone-max 
> \
> --parameter-value=3
> 
>   Set 'pgsql-ha' option: id=pgsql-ha-meta_attributes-clone-max
>   set=pgsql-ha-meta_attributes name=clone-max=3
> 
>   > crm_resource --resource pgsql-ha --meta --get-parameter=clone-max
>   pgsql-ha is active on more than one node, returning the default
> value for
>   clone-max 
>   3
> 
>   > crm_resource --resource pgsqld --meta --get-parameter=clone-max
>   3
> 
> If this behavior is not expected, maybe Danka's Pacemaker versions
> act
> differently because of this?
> 
> > Why does not it check OCF_RESKEY_CRM_meta_notify?
> 
> I was just not aware of this env variable. Sadly, it is not
> documented
> anywhere :(

It's not a Pacemaker-created value like the other notify variables --
all user-specified meta-attributes are passed that way. We do need to
document that.
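
For reference, my understanding of the current behavior (which is
exactly what we should write down): any meta-attribute set by the user
is passed to the agent as OCF_RESKEY_CRM_meta_<name>, with dashes
converted to underscores -- e.g. migration-threshold shows up as
OCF_RESKEY_CRM_meta_migration_threshold.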

> 
> I'll do some tests with it. It will save a call to crm_resource and
> all the
> machinery and sounds safer...
> 
> Thanks for the hint!
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] Inconsistent clone $OCF_RESOURCE_INSTANCE value depending on symmetric-cluster property.

2019-04-29 Thread Ken Gaillot
On Sat, 2019-04-27 at 10:27 +0300, Andrei Borzenkov wrote:
> Documentation says for clone resources OCF_RESOURCE_INSTANCE contains
> primitive qualified by instance number, like primitive:1.

That is pacemaker's practice (inherited from heartbeat).

The OCF standard itself says the variable contains "the name of the
resource instance," where the name "must be unique to the particular
resource type" and "is any name chosen by the administrator to identify
the resource instance."

The OCF standard was originally written without clones in mind, so it's
a grey area.

> I was rather surprised that pacemaker may actually omit qualification
> at
> least in the following case:
> 
> 1. *start* pacemaker with symmetric-cluster=false
> 2. do not add constraints allowing primitive in clone definition run
> anywhere
> 3. try to start clone
> 
> Resource agents get simple "primitive" for OCF_RESOURCE_INSTANCE
> instead
> of "primitive:1".

I'm confused -- how is the resource agent being called if there are no
constraints enabling it? Maybe for probes?

> Moreover, if now I set symmetric-cluster=true, pacemaker *continues*
> to
> provide OCF_RESOURCE_INSTANCE without qualification!
> 
> If I *start* pacemaker with symmetric-cluster=true (default)
> pacemaker
> provides qualified OCF_RESOURCE_INSTANCE and *continues* to do so
> even
> after I set symmetric-cluster=false. Until next pacemaker restart.
> 
> node 1: ha1 \
>   attributes master-m_Stateful=10
> node 2: ha2
> primitive A Dummy \
>   op start interval=0 \
>   op_params interval=0
> primitive B Dummy \
>   op start interval=0 \
>   op_params interval=0 \
>   meta target-role=Stopped
> primitive fence_disk stonith:fence_scsi \
>   params devices="/dev/sdb"
> primitive p_Stateful ocf:_local:Stateful_Test_1 \
>   op start interval=0
> ms m_Stateful p_Stateful \
>   meta target-role=Stopped
> location A-ha1 A 50: ha1
> location A-ha2 A 30: ha2
> location B-ha2 B 3: ha2
> colocation B-with-A -inf: B A
> property cib-bootstrap-options: \
>   dc-version="2.0.1+20190304.9e909a5bd-1.1-
> 2.0.1+20190304.9e909a5bd" \
>   cluster-infrastructure=corosync \
>   stonith-enabled=true \
>   last-lrm-refresh=1551115646 \
>   have-watchdog=false \
>   symmetric-cluster=false
> 
> And after trying to start m_Stateful
> 
> OCF_RESOURCE_INSTANCE=Stateful_Test_1
> OCF_RESOURCE_INSTANCE=p_Stateful
> 
> 
> Now delete symmetric-cluster
> 
> 
> node 1: ha1 \
>   attributes master-m_Stateful=10
> node 2: ha2
> primitive A Dummy \
>   op start interval=0 \
>   op_params interval=0
> primitive B Dummy \
>   op start interval=0 \
>   op_params interval=0 \
>   meta target-role=Stopped
> primitive fence_disk stonith:fence_scsi \
>   params devices="/dev/sdb"
> primitive p_Stateful ocf:_local:Stateful_Test_1 \
>   op start interval=0
> ms m_Stateful p_Stateful \
>   meta target-role=Started
> location A-ha1 A 50: ha1
> location A-ha2 A 30: ha2
> location B-ha2 B 3: ha2
> colocation B-with-A -inf: B A
> property cib-bootstrap-options: \
>   dc-version="2.0.1+20190304.9e909a5bd-1.1-
> 2.0.1+20190304.9e909a5bd" \
>   cluster-infrastructure=corosync \
>   stonith-enabled=true \
>   last-lrm-refresh=1551115646 \
>   have-watchdog=false
> 
> And try to start m_Stateful again
> 
>  meta-data
> OCF_RESOURCE_INSTANCE=Stateful_Test_1
>  start
> OCF_RESOURCE_INSTANCE=p_Stateful
>  promote
> OCF_RESOURCE_INSTANCE=p_Stateful
> 
> 
> In case I miss something obvious - is it intentional? If no, should I
> open bug report?

I don't think it's intentional. However, the instance number *should*
be irrelevant to the resource agent for anonymous clones. I would
consider it a bug if it's missing for a unique clone, but not if it
only happens for anonymous clones.
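
(If an agent genuinely needs the instance number for a globally unique
clone, a shell sketch would be something like
INSTANCE="${OCF_RESOURCE_INSTANCE##*:}" -- but anonymous clones
shouldn't depend on it, which is consistent with what you're seeing.)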
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Failover event not reported correctly?

2019-04-18 Thread Ken Gaillot
On Thu, 2019-04-18 at 15:51 -0600, JCA wrote:
> I have my CentOS two-node cluster, which some of you may already be
> sick and tired of reading about:
> 
> # pcs status
> Cluster name: FirstCluster
> Stack: corosync
> Current DC: two (version 1.1.19-8.el7_6.4-c3c624ea3d) - partition
> with quorum
> Last updated: Thu Apr 18 13:52:38 2019
> Last change: Thu Apr 18 13:50:57 2019 by root via cibadmin on one
> 
> 2 nodes configured
> 5 resources configured
> 
> Online: [ one two ]
> 
> Full list of resources:
> 
>  MyCluster  (ocf::myapp:myapp-script):  Started two
>  Master/Slave Set: DrbdDataClone [DrbdData]
>  Masters: [ two ]
>  Slaves: [ one ]
>  DrbdFS (ocf::heartbeat:Filesystem):  Started two
>  disk_fencing (stonith:fence_scsi): Started one
> 
> Daemon Status:
>   corosync: active/enabled
>   pacemaker: active/enabled
>   pcsd: active/enabled
> 
> I can stop either node, and the other will take over as expected.
> Here is the thing though:
> 
> myapp-script starts, stops and monitors the actual application that I
> am interested in. I'll call this application A. At the OS level, A is
> of course listed when I do ps awux. 
> 
> In the situation above, where A is running on two, I can kill A from
> the CentOS command line in two. Shortly after doing so, Pacemaker
> invokes myapp-script in two, in the following ways and returning the
> following values:
> 
>monitor: OCF_NOT_RUNNING
>stop: OCF_SUCCESS
>start: OCF_SUCCESS 
>monitor: OCF_SUCCESS
>  
> After this, with ps auwx in two I can see that A is indeed up and
> running. However, the output from pcs status (in either one or two)
> is now the following:
> 
> Cluster name: FirstCluster
> Stack: corosync
> Current DC: two (version 1.1.19-8.el7_6.4-c3c624ea3d) - partition
> with quorum
> Last updated: Thu Apr 18 15:21:25 2019
> Last change: Thu Apr 18 13:50:57 2019 by root via cibadmin on one
> 
> 2 nodes configured
> 5 resources configured
> 
> Online: [ one two ]
> 
> Full list of resources:
> 
>  MyCluster  (ocf::myapp:myapp-script):  Started two
>  Master/Slave Set: DrbdDataClone [DrbdData]
>  Masters: [ two ]
>  Slaves: [ one ]
>  DrbdFS (ocf::heartbeat:Filesystem):  Started two
>  disk_fencing (stonith:fence_scsi): Started one
> 
> Failed Actions:
> * MyCluster_monitor_3 on two 'not running' (7): call=35,
> status=complete, exitreason='',
> last-rc-change='Thu Apr 18 15:21:12 2019', queued=0ms, exec=0ms
> 
> 
> Daemon Status:
>   corosync: active/enabled
>   pacemaker: active/enabled
>   pcsd: active/enabled
> 
> And the cluster seems to stay stuck there, until I stop and start
> node two explicitly.
> 
>Is this the expected behavior? What I was expecting is for
> Pacemaker to restart A, in either node - which it indeed does, in two
> itself. But pcs status seems to think that an error happened when
> trying to restart A - despite the fact that it got A restarted all
> right. And I know that A is running correctly to boot.
> 
>What am I misunderstanding here?

You got everything right, except the display is not saying the restart
failed -- it's saying there was a monitor failure that led to the
restart. The "failed actions" section is a history rather than the
current status (which is the "full cluster status" section).

The idea is that failures might occur when you're not looking :) and
you can see that they happened the next time you check the status, even
if the cluster was able to recover successfully.

To clear the history, run "crm_resource -C -r MyCluster" (or "pcs
resource cleanup MyCluster" if you're using pcs).
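
If you'd rather have old failures age out on their own, the
failure-timeout resource meta-attribute does that -- as a sketch,
something like "pcs resource update MyCluster meta
failure-timeout=10min", with whatever expiry makes sense for you.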
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Pacemaker detail log directory permissions

2019-04-26 Thread Ken Gaillot
On Thu, 2019-04-25 at 18:49 +0200, Jan Pokorný wrote:
> On 24/04/19 09:32 -0500, Ken Gaillot wrote:
> > On Wed, 2019-04-24 at 16:08 +0200, wf...@niif.hu wrote:
> > > Make install creates /var/log/pacemaker with mode 0770, owned by
> > > hacluster:haclient.  However, if I create the directory as
> > > root:root
> > > instead, pacemaker.log appears as hacluster:haclient all the
> > > same.  What breaks in this setup besides log rotation (which can
> > > be
> > > fixed by removing the su directive)?  Why is it a good idea to
> > > let
> > > the haclient group write the logs?
> > 
> > Cluster administrators are added to the haclient group. It's a
> > minor
> > use case, but the group write permission allows such users to run
> > commands that log to the detail log. An example would be running
> > "crm_resource --force-start" for a resource agent that writes debug
> > information to the log.
> 
> I think the prime and foremost use case is that half of the actual
> pacemaker daemons run as hacluster:haclient themselves, and it's
> preferred for them to be not completely muted about what they do,
> correct? :-)

The logs are owned by hacluster user, so the daemons don't have an
issue.

> 
> Indeed, users can configure whatever log routing they desire
> (I was actually toying with an idea to make it a lot more flexible,
> log-per-type-of-daemon and perhaps even distinguished by PID,
> configurable log formats since currently it's arguably a heavy
> overkill to keep the hostname stated repeatedly over and over
> without actually bothering to recheck it from time to time, etc.).
> 
> Also note, relying on almighty root privileges (like with the
> pristine
> deployment) is a silent misconception that cannot be taken for fully
> granted, so again arguably, even the root daemons should take
> a haclient group's coat on top of their own just in case [*].
> 
> > If ACLs are not in use, such users already have full read/write
> > access to the CIB, so being able to read and write the log is not
> > an
> > additional concern.
> > 
> > With ACLs, I could see wanting to change the permissions, and that
> > idea
> > has come up already. One approach might be to add a PCMK_log_mode
> > option that would default to 0660, and users could make it more
> > strict
> > if desired.
> 
> It looks reasonable to prevent read-backs by anyone but root, that
> could be applied without any further toggles, assuming the pacemaker
> code won't flip once purposefully allowed read bits for group back
> automatically and unconditionally.

Pacemaker does indeed ensure the detail log has specific ownerships and
permissions -- see crm_add_logfile().

> 
> [*] for instance when SELinux hits hard (which is currently not the
> case for Fedora/EL family), even though the executor(s) would
> need
> to be exempted if process inheritance taints the tree once
> forever:
> https://danwalsh.livejournal.com/69478.html
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] two virtual domains start and stop every 15 minutes

2019-07-05 Thread Ken Gaillot
On Fri, 2019-07-05 at 13:07 +0200, Lentes, Bernd wrote:
> 
> - On Jul 4, 2019, at 1:25 AM, kgaillot kgail...@redhat.com wrote:
> 
> > On Wed, 2019-06-19 at 18:46 +0200, Lentes, Bernd wrote:
> > > - On Jun 15, 2019, at 4:30 PM, Bernd Lentes
> > > bernd.len...@helmholtz-muenchen.de wrote:
> > > 
> > > > - Am 14. Jun 2019 um 21:20 schrieb kgaillot 
> > > > kgail...@redhat.com
> > > > :
> > > > 
> > > > > On Fri, 2019-06-14 at 18:27 +0200, Lentes, Bernd wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > i had that problem already once but still it's not clear
> > > > > > for me
> > > > > > what
> > > > > > really happens.
> > > > > > I had this problem some days ago:
> > > > > > I have a 2-node cluster with several virtual domains as
> > > > > > resources. I
> > > > > > put one node (ha-idg-2) into standby, and two running
> > > > > > virtual
> > > > > > domains
> > > > > > were migrated to the other node (ha-idg-1). The other
> > > > > > virtual
> > > > > > domains
> > > > > > were already running on ha-idg-1.
> > > > > > Since then the two virtual domains which migrated
> > > > > > (vm_idcc_devel and
> > > > > > vm_severin) start or stop every 15 minutes on ha-idg-1.
> > > > > > ha-idg-2 resides in standby.
> > > > > > I know that the 15 minutes interval is related to the
> > > > > > "cluster-
> > > > > > recheck-interval".
> > > > > > But why are these two domains started and stopped ?
> > > > > > I looked around much in the logs, checked the pe-input
> > > > > > files,
> > > > > > watched
> > > > > > some graphs created by crm_simulate with dotty ...
> > > > > > I always see that the domains are started and 15 minutes
> > > > > > later
> > > > > > stopped and 15 minutes later started ...
> > > > > > but i don't see WHY. I would really like to know that.
> > > > > > And why are the domains not started from the monitor
> > > > > > resource
> > > > > > operation ? It should recognize that the domain is stopped
> > > > > > and
> > > > > > starts
> > > > > > it again. My monitor interval is 30 seconds.
> > > > > > I had two errors pending concerning these domains, a failed
> > > > > > migrate
> > > > > > from ha-idg-1 to ha-idg-2, form some time before.
> > > > > > Could that be the culprit ?
> > 
> > It did indeed turn out to be.
> > 
> > The resource history on ha-idg-1 shows the last failed action as a
> > migrate_to from ha-idg-1 to ha-idg-2, and the last successful
> > action as
> > a migrate_from from ha-idg-2 to ha-idg-1. That confused pacemaker
> > as to
> > the current status of the migration.
> > 
> > A full migration is migrate_to on the source node, migrate_from on
> > the
> > target node, and stop on the source node. When the resource history
> > has
> > a failed migrate_to on the source, and a stop but no migrate_from
> > on
> > the target, the migration is considered "dangling" and forces a
> > stop of
> > the resource on the source, because it's possible the migrate_from
> > never got a chance to be scheduled.
> > 
> > That is wrong in this situation. The resource is happily running on
> > the
> > node with the failed migrate_to because it was later moved back
> > successfully, and the failed migrate_to is no longer relevant.
> > 
> > My current plan for a fix is that if a node with a failed
> > migrate_to
> > has a successful migrate_from or start that's newer, and the target
> > node of the failed migrate_to has a successful stop, then the
> > migration
> > should not be considered dangling.
> > 
> > A couple of side notes on your configuration:
> > 
> > Instead of putting action=off in fence device configurations, you
> > should use pcmk_reboot_action=off. Pacemaker adds action when
> > sending
> > the fence command.
> 
> I did that already.
>  
> > When keeping a fence device off its target node, use a finite
> > negative
> > score rather than -INFINITY. This ensures the node can fence itself
> > as
> > a last resort.
> 
> I will do that.
> 
> Thanks for clarifying this, it happened very often.
> I conclude that it's very important to cleanup a resource failure
> quickly after finding the cause
> and solving the problem, not having any pending errors.

This is the first bug I can recall that was triggered by an old
failure, so I don't think it's important as a general policy outside of
live migrations.

I've got a fix I'll merge soon.

> 
> Bernd
>  
> 
> Helmholtz Zentrum Muenchen
> Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
> Ingolstaedter Landstr. 1
> 85764 Neuherberg
> www.helmholtz-muenchen.de
> Aufsichtsratsvorsitzende: MinDir'in Prof. Dr. Veronika von Messling
> Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep,
> Heinrich Bassler, Kerstin Guenther
> Registergericht: Amtsgericht Muenchen HRB 6466
> USt-IdNr: DE 129521671
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] colocation - but do not stop resources on failure

2019-07-10 Thread Ken Gaillot
On Wed, 2019-07-10 at 10:30 +0100, lejeczek wrote:
> On 09/07/2019 20:26, Ken Gaillot wrote:
> > On Tue, 2019-07-09 at 11:21 +0100, lejeczek wrote:
> > > hi guys,
> > > 
> > > how to, if possible, create colocation which would not stop
> > > dependent
> > > resources if the target(that would be systemd agent) resource
> > > fails
> > > on
> > > all nodes?
> > > 
> > > many thanks, L.
> > 
> > Sure, just use a finite score. Colocation is mandatory if the score
> > is
> > INFINITY (or -INFINITY for anti-colocation), otherwise it's a
> > preference rather than a requirement.
> 
> so simple, fantastic, many! thanks.
> 
> Can already existing constraint be changed/updated, maybe similarly
> to
> how resource can? I'm thumbing through man pages but fail to find an
> answer.

Yes, but it depends on what configuration interface you're using
(cibadmin / crm / pcs / etc.), so keep thumbing. ;) You want to change
the score on an existing constraint, or if the tool doesn't offer that,
drop the old one and add a new one (you can do that in a batch file, or put
the cluster into maintenance mode first, to avoid any resource
shuffling while you're doing it).
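
For example (a rough sketch; "col-B-with-A", "B" and "A" are placeholder
IDs -- use whatever your configuration actually contains):

  # crm shell: edit the constraint's score in place
  crm configure edit col-B-with-A

  # pcs: list constraint IDs, then drop and re-add with a finite score
  pcs constraint --full
  pcs constraint remove col-B-with-A
  pcs constraint colocation add B with A 500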
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] failed resource resurection - failcount/cleanup etc ?

2019-07-10 Thread Ken Gaillot
On Wed, 2019-07-10 at 11:26 +0100, lejeczek wrote:
> hi guys, possibly @devel if they pop in here.
> 
> is there, will there be, a way to make cluster deal with failed
> resources in such a way that cluster would try not to give up on
> failed
> resources?
> 
> I understand that as of now the only way is  user's manual
> intervention
> (under which I'd include any scripted ways outside of the cluster) if
> we
> need to bring back up a failed resource.
> 
> many thanks, L.

Not sure what you mean ... the default behavior is to try restarting a
failed resource up to 1,000,000 times on the same node, then try
starting it on a different node, and not give up until all nodes have
failed to start it.

This is affected by on-fail, migration-threshold, failure-timeout, and
start-failure-is-fatal.
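
For example, migration-threshold and failure-timeout are resource
meta-attributes, on-fail is an operation option, and
start-failure-is-fatal is a cluster property -- with pcs that would look
roughly like this (the resource name is just a placeholder):

  pcs resource update my-rsc meta migration-threshold=3 failure-timeout=10min
  pcs resource update my-rsc op monitor interval=30s on-fail=restart
  pcs property set start-failure-is-fatal=false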

If you're talking about a resource that failed because the entire node
failed, then fencing comes into play.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] failed resource resurection - failcount/cleanup etc ?

2019-07-11 Thread Ken Gaillot
On Thu, 2019-07-11 at 10:39 +0100, lejeczek wrote:
> On 10/07/2019 15:50, Ken Gaillot wrote:
> > On Wed, 2019-07-10 at 11:26 +0100, lejeczek wrote:
> > > hi guys, possibly @devel if they pop in here.
> > > 
> > > is there, will there be, a way to make cluster deal with failed
> > > resources in such a way that cluster would try not to give up on
> > > failed
> > > resources?
> > > 
> > > I understand that as of now the only way is  user's manual
> > > intervention
> > > (under which I'd include any scripted ways outside of the
> > > cluster) if
> > > we
> > > need to bring back up a failed resource.
> > > 
> > > many thanks, L.
> > 
> > Not sure what you mean ... the default behavior is to try
> > restarting a
> > failed resource up to 1,000,000 times on the same node, then try
> > starting it on a different node, and not give up until all nodes
> > have
> > failed to start it.
> > 
> > This is affected by on-fail, migration-threshold, failure-timeout,
> > and
> > start-failure-is-fatal.
> > 
> > If you're talking about a resource that failed because the entire
> > node
> > failed, then fencing comes into play.
> 
> Apologies for I was not clear enough while wording my question, I see
> that now. When I said - make cluster deal with failed resources - I
> meant a resource which failed in the (whole) cluster, failed on every
> node.
> 
> If that happens I see that only my (user manual) intervention can
> make
> cluster peep at the resource again and I wonder if this is me unaware
> that there are ways it can be done, that cluster will not need me and
> by
> itself would do something, will not give up.
> 
> My case is: a systemd resource which whether successful or not is
> determined by a mechanism outside of the cluster, it can only
> successfully start on one single node. When that node reboots then
> cluster fails this resource, when that node rebooted and is up again
> the
> failed resource remains in failed state.
> 
> Hopefully I managed to make it a bit clearer this time.
> 
> Many thanks, L.

Ah, yes. failure-timeout is the only way to handle that. Keep in mind
it is not guaranteed to be checked more frequently than the cluster-
recheck-interval.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] failed resource resurection - failcount/cleanup etc ?

2019-07-12 Thread Ken Gaillot
On Fri, 2019-07-12 at 13:33 +0100, lejeczek wrote:
> On 11/07/2019 14:16, Ken Gaillot wrote:
> > On Thu, 2019-07-11 at 10:39 +0100, lejeczek wrote:
> > > On 10/07/2019 15:50, Ken Gaillot wrote:
> > > > On Wed, 2019-07-10 at 11:26 +0100, lejeczek wrote:
> > > > > hi guys, possibly @devel if they pop in here.
> > > > > 
> > > > > is there, will there be, a way to make cluster deal with
> > > > > failed
> > > > > resources in such a way that cluster would try not to give up
> > > > > on
> > > > > failed
> > > > > resources?
> > > > > 
> > > > > I understand that as of now the only way is  user's manual
> > > > > intervention
> > > > > (under which I'd include any scripted ways outside of the
> > > > > cluster) if
> > > > > we
> > > > > need to bring back up a failed resource.
> > > > > 
> > > > > many thanks, L.
> > > > 
> > > > Not sure what you mean ... the default behavior is to try
> > > > restarting a
> > > > failed resource up to 1,000,000 times on the same node, then
> > > > try
> > > > starting it on a different node, and not give up until all
> > > > nodes
> > > > have
> > > > failed to start it.
> > > > 
> > > > This is affected by on-fail, migration-threshold, failure-
> > > > timeout,
> > > > and
> > > > start-failure-is-fatal.
> > > > 
> > > > If you're talking about a resource that failed because the
> > > > entire
> > > > node
> > > > failed, then fencing comes into play.
> > > 
> > > Apologies for I was not clear enough while wording my question, I
> > > see
> > > that now. When I said - make cluster deal with failed resources -
> > > I
> > > meant a resource which failed in the (whole) cluster, failed on
> > > every
> > > node.
> > > 
> > > If that happens I see that only my (user manual) intervention can
> > > make
> > > cluster peep at the resource again and I wonder if this is me
> > > unaware
> > > that there are ways it can be done, that cluster will not need me
> > > and
> > > by
> > > itself would do something, will not give up.
> > > 
> > > My case is: a systemd resource which whether successful or not is
> > > determined by a mechanism outside of the cluster, it can only
> > > successfully start on one single node. When that node reboots
> > > then
> > > cluster fails this resource, when that node rebooted and is up
> > > again
> > > the
> > > failed resource remains in failed state.
> > > 
> > > Hopefully I managed to make it a bit clearer this time.
> > > 
> > > Many thanks, L.
> > 
> > Ah, yes. failure-timeout is the only way to handle that. Keep in
> > mind
> > it is not guaranteed to be checked more frequently than the
> > cluster-
> > recheck-interval.
> 
> fantastic!
> 
> Is "cluster-recheck-interval" tough on the cluster? Is okey to take
> it
> down from default 15min?
> 
> thanks, L.

Certainly 5min is fine. I've seen users take it down as far as 1min,
although that makes me uneasy for no defined reason. It's not a lot of
overhead -- you can run "time crm_simulate -SL" to get an idea of what
it takes (plus a somewhat increased log volume).
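
For instance (property syntax shown for pcs; crm shell users would use
"crm configure property" instead):

  time crm_simulate -SL    # roughly what one recheck costs
  pcs property set cluster-recheck-interval=5min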
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Problems with master/slave failovers

2019-07-02 Thread Ken Gaillot
mote_0
> >  * Pseudo action:   ms_servant2_promote_0
> >  * Resource action: servant3notify on primary
> >  * Resource action: servant3notify on secondary
> >  * Pseudo action:   ms_servant3_confirmed-pre_notify_promote_0
> >  * Pseudo action:   ms_servant3_promote_0
> >  * Resource action: king_resource   start on primary
> >  * Pseudo action:   ms_king_resource_running_0
> >  * Resource action: servant2promote on secondary
> >  * Pseudo action:   ms_servant2_promoted_0
> >  * Resource action: servant3promote on secondary
> >  * Pseudo action:   ms_servant3_promoted_0
> >  * Pseudo action:   ms_king_resource_post_notify_running_0
> >  * Pseudo action:   ms_servant2_post_notify_promoted_0
> >  * Pseudo action:   ms_servant3_post_notify_promoted_0
> >  * Resource action: king_resource   notify on primary
> >  * Resource action: king_resource   notify on secondary
> >  * Pseudo action:   ms_king_resource_confirmed-
> > post_notify_running_0
> >  * Resource action: servant2notify on primary
> >  * Resource action: servant2notify on secondary
> >  * Pseudo action:   ms_servant2_confirmed-post_notify_promoted_0
> >  * Resource action: servant3notify on primary
> >  * Resource action: servant3notify on secondary
> >  * Pseudo action:   ms_servant3_confirmed-post_notify_promoted_0
> >  * Pseudo action:   ms_king_resource_pre_notify_promote_0
> >  * Resource action: servant2monitor=11000 on primary
> >  * Resource action: servant2monitor=1 on secondary
> >  * Resource action: servant3monitor=11000 on primary
> >  * Resource action: servant3monitor=1 on secondary
> >  * Resource action: king_resource   notify on primary
> >  * Resource action: king_resource   notify on secondary
> >  * Pseudo action:   ms_king_resource_confirmed-pre_notify_promote_0
> >  * Pseudo action:   ms_king_resource_promote_0
> >  * Resource action: king_resource   promote on secondary
> >  * Pseudo action:   ms_king_resource_promoted_0
> >  * Pseudo action:   ms_king_resource_post_notify_promoted_0
> >  * Resource action: king_resource   notify on primary
> >  * Resource action: king_resource   notify on secondary
> >  * Pseudo action:   ms_king_resource_confirmed-
> > post_notify_promoted_0
> >  * Resource action: king_resource   monitor=11000 on primary
> >  * Resource action: king_resource   monitor=1 on secondary
> > Using the original execution date of: 2019-06-29 02:33:03Z
> > 
> > Revised cluster status:
> > Online: [ primary secondary ]
> > 
> >  stk_shared_ip  (ocf::heartbeat:IPaddr2):   Started secondary
> >  Clone Set: ms_king_resource [king_resource] (promotable)
> >  Masters: [ secondary ]
> >  Slaves: [ primary ]
> >  Clone Set: ms_servant1 [servant1]
> >  Started: [ primary secondary ]
> >  Clone Set: ms_servant2 [servant2] (promotable)
> >  Masters: [ secondary ]
> >  Slaves: [ primary ]
> >  Clone Set: ms_servant3 [servant3] (promotable)
> >  Masters: [ secondary ]
> >  Slaves: [ primary ]
> >  servant4(lsb:servant4):  Started secondary
> >  servant5  (lsb:servant5):Started secondary
> >  servant6  (lsb:servant6):Started secondary
> >  servant7  (lsb:servant7):  Started secondary
> >  servant8  (lsb:servant8):Started secondary
> >  Resource Group: servant9_active_disabled
> >  servant9_resource1  (lsb:servant9_resource1):Started
> > secondary
> >  servant9_resource2   (lsb:servant9_resource2): Started
> > secondary
> >  servant10 (lsb:servant10):   Started secondary
> >  servant11 (lsb:servant11):  Started secondary
> >  servant12(lsb:servant12):  Started secondary
> >  servant13(lsb:servant13):  Started secondary
> > 
> 
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] two virtual domains start and stop every 15 minutes

2019-07-03 Thread Ken Gaillot
dg-1']//lrm_resource[@id='vm_mouseidgenes']: 
> OK (rc=0, origin=ha-idg-1/crmd/707, version=
> 2.7007.890)
> Jun 19 14:57:32 [9583] ha-idg-1   crmd: info:
> delete_resource:  Removing resource vm_mouseidgenes for 4460a5c3-
> c009-44f6-a01d-52f93e731fda (root) on ha-idg-1
> Jun 19 14:57:32 [9583] ha-idg-1   crmd: info:
> notify_deleted:   Notifying 4460a5c3-c009-44f6-a01d-52f93e731fda on
> ha-idg-1 that vm_mouseidgenes was deleted
> Jun 19 14:57:32 [9577] ha-idg-1cib: info:
> cib_process_request:  Forwarding cib_delete operation for section
> //node_state[@uname='ha-idg-1']//lrm_resource[@id='vm_mouseidgenes']
> to all (origin=local/crmd/708)
> Jun 19 14:57:32 [9583] ha-idg-1   crmd:  warning:
> qb_ipcs_event_sendv:  new_event_notification (9583-10294-15):
> Broken pipe (32)
> Jun 19 14:57:32 [9577] ha-idg-1cib: info:
> cib_perform_op:   Diff: --- 2.7007.890 2
> Jun 19 14:57:32 [9577] ha-idg-1cib: info:
> cib_perform_op:   Diff: +++ 2.7007.891 (null)
> Jun 19 14:57:32 [9577] ha-idg-1cib: info:
> cib_perform_op:   --
> /cib/status/node_state[@id='1084777482']/lrm[@id='1084777482']/lrm_re
> sources/lrm_resource[@id='vm_mouseidgenes']
> Jun 19 14:57:32 [9577] ha-idg-1cib: info:
> cib_perform_op:   +  /cib:  @num_updates=891
> Jun 19 14:57:32 [9577] ha-idg-1cib: info:
> cib_process_request:  Completed cib_delete operation for section
> //node_state[@uname='ha-idg-1']//lrm_resource[@id='vm_mouseidgenes']: 
> OK (rc=0, origin=ha-idg-1/crmd/708, version=
> 2.7007.891)
> Jun 19 14:57:32 [9577] ha-idg-1cib: info:
> cib_process_request:  Forwarding cib_modify operation for section
> crm_config to all (origin=local/crmd/710)
> 
> Jun 19 14:57:32 [9583] ha-idg-1   crmd: info:
> abort_transition_graph:   > Transition 250 aborted
> <== by deletion of lrm_resource[@id='vm_mouseidgenes']:
> Resource state removal | cib=2.7007.891 source=abort_unless_down:344
> path=/cib/sta
> tus/node_state[@id='1084777482']/lrm[@id='1084777482']/lrm_resources/
> lrm_resource[@id='vm_mouseidgenes'] complete=true
> 
> Jun 19 14:57:32 [9583] ha-idg-1   crmd:   notice:
> do_state_transition:  State transition S_IDLE -> S_POLICY_ENGINE
> | input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph
> Jun 19 14:57:32 [9578] ha-idg-1 stonith-ng: info:
> update_cib_stonith_devices_v2:Updating device list from the cib:
> delete lrm_resource[@id='vm_mouseidgenes']
> Jun 19 14:57:32 [9578] ha-idg-1 stonith-ng: info:
> cib_devices_update:   Updating devices to version 2.7007.891
> Jun 19 14:57:32 [9577] ha-idg-1cib: info:
> cib_perform_op:   Diff: --- 2.7007.891 2
> Jun 19 14:57:32 [9577] ha-idg-1cib: info:
> cib_perform_op:   Diff: +++ 2.7008.0 (null)
> Jun 19 14:57:32 [9577] ha-idg-1cib: info:
> cib_perform_op:   +  /cib:  @epoch=7008, @num_updates=0
> Jun 19 14:57:32 [9577] ha-idg-1cib: info:
> cib_perform_op:   +  /cib/configuration/crm_config/cluster_property_s
> et[@id='cib-bootstrap-options']/nvpair[@id='cib-bootstrap-options-
> last-lrm-refresh']:  @value=1560949052
> Jun 19 14:57:32 [9578] ha-idg-1 stonith-ng:   notice:
> unpack_config:On loss of CCM Quorum: Ignore
> Jun 19 14:57:32 [9577] ha-idg-1cib: info:
> cib_process_request:  Completed cib_modify operation for section
> crm_config: OK (rc=0, origin=ha-idg-1/crmd/710, version=2.7008.0)
> Jun 19 14:57:32 [9583] ha-idg-1   crmd: info:
> abort_transition_graph:   Transition 250 aborted by cib-bootstrap-
> options-last-lrm-refresh doing modify last-lrm-refresh=1560949052:
> Configuration change | cib=2.7008.0 source=te_update_diff_v2:500
> path=/cib/configuration/crm_config/cluster_property_set[@id='cib-
> bootstrap-options']/nvpair[@id='cib-bootstrap-options-last-lrm-
> refresh'] complete=true
> 
> a few minutes later transition 250 is aborted. How can something
> which is completed be aborted?
> 
> 
>  
> 
> Helmholtz Zentrum Muenchen
> Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
> Ingolstaedter Landstr. 1
> 85764 Neuherberg
> www.helmholtz-muenchen.de
> Aufsichtsratsvorsitzende: MinDir'in Prof. Dr. Veronika von Messling
> Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep,
> Heinrich Bassler, Kerstin Guenther
> Registergericht: Amtsgericht Muenchen HRB 6466
> USt-IdNr: DE 129521671
> 
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Problems with master/slave failovers

2019-07-01 Thread Ken Gaillot
post_notify_promoted_0
>  * Pseudo action:   ms_servant3_post_notify_promoted_0
>  * Resource action: king_resource   notify on primary
>  * Resource action: king_resource   notify on secondary
>  * Pseudo action:   ms_king_resource_confirmed-post_notify_running_0
>  * Resource action: servant2notify on primary
>  * Resource action: servant2notify on secondary
>  * Pseudo action:   ms_servant2_confirmed-post_notify_promoted_0
>  * Resource action: servant3notify on primary
>  * Resource action: servant3notify on secondary
>  * Pseudo action:   ms_servant3_confirmed-post_notify_promoted_0
>  * Pseudo action:   ms_king_resource_pre_notify_promote_0
>  * Resource action: servant2monitor=11000 on primary
>  * Resource action: servant2monitor=1 on secondary
>  * Resource action: servant3monitor=11000 on primary
>  * Resource action: servant3monitor=1 on secondary
>  * Resource action: king_resource   notify on primary
>  * Resource action: king_resource   notify on secondary
>  * Pseudo action:   ms_king_resource_confirmed-pre_notify_promote_0
>  * Pseudo action:   ms_king_resource_promote_0
>  * Resource action: king_resource   promote on secondary
>  * Pseudo action:   ms_king_resource_promoted_0
>  * Pseudo action:   ms_king_resource_post_notify_promoted_0
>  * Resource action: king_resource   notify on primary
>  * Resource action: king_resource   notify on secondary
>  * Pseudo action:   ms_king_resource_confirmed-post_notify_promoted_0
>  * Resource action: king_resource   monitor=11000 on primary
>  * Resource action: king_resource   monitor=1 on secondary
> Using the original execution date of: 2019-06-29 02:33:03Z
> 
> Revised cluster status:
> Online: [ primary secondary ]
> 
>  stk_shared_ip  (ocf::heartbeat:IPaddr2):   Started secondary
>  Clone Set: ms_king_resource [king_resource] (promotable)
>  Masters: [ secondary ]
>  Slaves: [ primary ]
>  Clone Set: ms_servant1 [servant1]
>  Started: [ primary secondary ]
>  Clone Set: ms_servant2 [servant2] (promotable)
>  Masters: [ secondary ]
>  Slaves: [ primary ]
>  Clone Set: ms_servant3 [servant3] (promotable)
>  Masters: [ secondary ]
>  Slaves: [ primary ]
>  servant4(lsb:servant4):  Started secondary
>  servant5  (lsb:servant5):Started secondary
>  servant6  (lsb:servant6):Started secondary
>  servant7  (lsb:servant7):  Started secondary
>  servant8  (lsb:servant8):Started secondary
>  Resource Group: servant9_active_disabled
>  servant9_resource1  (lsb:servant9_resource1):Started
> secondary
>  servant9_resource2   (lsb:servant9_resource2): Started secondary
>  servant10 (lsb:servant10):   Started secondary
>  servant11 (lsb:servant11):  Started secondary
>  servant12(lsb:servant12):  Started secondary
>  servant13(lsb:servant13):  Started secondary
> 
> 
> I don't think that there is an issue with the CIB constraints
> configuration, otherwise the resources would not be able to start
> upon bootup, but I'll keep digging and report back if I find any
> cause.
> 
> Thanks again,
> Harvey
> 
> 
> From: Users  on behalf of Ken Gaillot
> 
> Sent: Saturday, 29 June 2019 3:10 a.m.
> To: Cluster Labs - All topics related to open-source clustering
> welcomed
> Subject: EXTERNAL: Re: [ClusterLabs] Problems with master/slave
> failovers
> 
> On Fri, 2019-06-28 at 07:36 +, Harvey Shepherd wrote:
> > Thanks for your reply Andrei. Whilst I understand what you say
> > about
> > the difficulties of diagnosing issues without all of the info, it's
> > a
> > compromise between a mailing list posting being very verbose in
> > which
> > case nobody wants to read it, and containing enough relevant
> > information for someone to be able to help. With 20+ resources
> > involved during a failover there are literally thousands of logs
> > generated, and it would be pointless to post them all.
> > 
> > I've tried to focus in on the king resource only to keep things
> > simple, as that is the only resource that can initiate a failover.
> > I
> > provided the real master scores and transition decisions made by
> > pacemaker at the times that I killed the king master resource by
> > showing the crm_simulator output from both tests, and the CIB
> > config
> > is ss described. As I mentioned, migration-threshold is set to zero
> > for all resources, so it shouldn't prevent a second failover.
> > 
> > Regarding the resource agent return codes, the failure is detected
> > by
> > the 1

Re: [ClusterLabs] Problems with master/slave failovers

2019-07-01 Thread Ken Gaillot
 secondary: 0
> > >> native_color: king_resource:0 allocation score on primary: 0
> > >> native_color: king_resource:0 allocation score on secondary: 200
> > >> native_color: king_resource:1 allocation score on primary: 101
> > >> native_color: king_resource:1 allocation score on secondary:
> > -INFINITY
> > >> king_resource:1 promotion score on primary: 1
> > >> king_resource:0 promotion score on secondary: 1
> > 
> > At this point neither node has clear preference as master for
> > king_resource, so I would expect pacemaker to prefer the current
> > node
> > (it has to break tie somehow). Master scores are normally set by
> > resource agents so you really need to investigate what your agent
> > does
> > and how scores are set when this happens.
> > 
> > >>  * Recoverking_resource:1 ( Master primary )
> > >>  * Pseudo action:   ms_king_resource_pre_notify_demote_0
> > >>  * Resource action: king_resource   notify on secondary
> > >>  * Resource action: king_resource   notify on primary
> > >>  * Pseudo action:   ms_king_resource_confirmed-
> > pre_notify_demote_0
> > >>  * Pseudo action:   ms_king_resource_demote_0
> > >>  * Resource action: king_resource   demote on primary
> > >>  * Pseudo action:   ms_king_resource_demoted_0
> > >>  * Pseudo action:   ms_king_resource_post_notify_demoted_0
> > >>  * Resource action: king_resource   notify on secondary
> > >>  * Resource action: king_resource   notify on primary
> > >>  * Pseudo action:   ms_king_resource_confirmed-
> > post_notify_demoted_0
> > >>  * Pseudo action:   ms_king_resource_pre_notify_stop_0
> > >>  * Resource action: king_resource   notify on secondary
> > >>  * Resource action: king_resource   notify on primary
> > >>  * Pseudo action:   ms_king_resource_confirmed-pre_notify_stop_0
> > >>  * Pseudo action:   ms_king_resource_stop_0
> > >>  * Resource action: king_resource   stop on primary
> > >>  * Pseudo action:   ms_king_resource_stopped_0
> > >>  * Pseudo action:   ms_king_resource_post_notify_stopped_0
> > >>  * Resource action: king_resource   notify on secondary
> > >>  * Pseudo action:   ms_king_resource_confirmed-
> > post_notify_stopped_0
> > >>  * Pseudo action:   ms_king_resource_pre_notify_start_0
> > >>  * Resource action: king_resource   notify on secondary
> > >>  * Pseudo action:   ms_king_resource_confirmed-
> > pre_notify_start_0
> > >>  * Pseudo action:   ms_king_resource_start_0
> > >>  * Resource action: king_resource   start on primary
> > >>  * Pseudo action:   ms_king_resource_running_0
> > >>  * Pseudo action:   ms_king_resource_post_notify_running_0
> > >>  * Resource action: king_resource   notify on secondary
> > >>  * Resource action: king_resource   notify on primary
> > >>  * Pseudo action:   ms_king_resource_confirmed-
> > post_notify_running_0
> > >>  * Pseudo action:   ms_king_resource_pre_notify_promote_0
> > >>  * Resource action: king_resource   notify on secondary
> > >>  * Resource action: king_resource   notify on primary
> > >>  * Pseudo action:   ms_king_resource_confirmed-
> > pre_notify_promote_0
> > >>  * Pseudo action:   ms_king_resource_promote_0
> > >>  * Resource action: king_resource   promote on primary
> > >>  * Pseudo action:   ms_king_resource_promoted_0
> > >>  * Pseudo action:   ms_king_resource_post_notify_promoted_0
> > >>  * Resource action: king_resource   notify on secondary
> > >>  * Resource action: king_resource   notify on primary
> > >>  * Pseudo action:   ms_king_resource_confirmed-
> > post_notify_promoted_0
> > >>  * Resource action: king_resource   monitor=1 on primary
> > >>  Clone Set: ms_king_resource [king_resource] (promotable)
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] Pacemaker detail log directory permissions

2019-04-24 Thread Ken Gaillot
On Wed, 2019-04-24 at 16:08 +0200, wf...@niif.hu wrote:
> Hi,
> 
> Make install creates /var/log/pacemaker with mode 0770, owned by
> hacluster:haclient.  However, if I create the directory as root:root
> instead, pacemaker.log appears as hacluster:haclient all the
> same.  What
> breaks in this setup besides log rotation (which can be fixed by
> removing the su directive)?  Why is it a good idea to let the
> haclient
> group write the logs?

Cluster administrators are added to the haclient group. It's a minor
use case, but the group write permission allows such users to run
commands that log to the detail log. An example would be running
"crm_resource --force-start" for a resource agent that writes debug
information to the log.

If ACLs are not in use, such users already have full read/write access
to the CIB, so being able to read and write the log is not an
additional concern.

With ACLs, I could see wanting to change the permissions, and that idea
has come up already. One approach might be to add a PCMK_log_mode
option that would default to 0660, and users could make it more strict
if desired.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Coming in Pacemaker 2.0.2: changes of interest to packagers

2019-04-24 Thread Ken Gaillot
People who build pacemaker packages (especially for official
distributions) may be interested in these changes coming in Pacemaker
2.0.2:

* The bug report URL displayed by crm_report when collecting
information is now configurable. The default is the same as before (the
ClusterLabs bugzilla URL). If a distribution would rather guide users
to the distribution bugzilla for filing bug reports, pacemaker's
configure script now accepts a --with-bug-url option.

* Two private libraries formerly present, libpengine and
libtransitioner, have been combined into a single libpacemaker. As only
private APIs are affected, the change does not break public API
backward compatibility. (The new library will eventually contain some
high-level public APIs.)
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Pacemaker 2.0.2-rc1 now available

2019-04-24 Thread Ken Gaillot
Source code for the first release candidate for Pacemaker version 2.0.2
is now available at:

https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.0.2-rc1

This is primarily a security release, with stricter two-way
authentication of inter-process communication. The most significant
issue this fixes is a privilege escalation vulnerability allowing an
attacker with login access on a node to use an impostor pacemaker
subdaemon to gain root privileges if pacemaker is started after the
impostor.

Since this is a security release, I'm planning on a shorter cycle than
normal, maybe 4-6 weeks before final release. Basically rc1 will remain
unchanged unless we find regressions.

Besides security fixes, this release has some helpful bug fixes and a
few small features:

* crm_resource --validate can now be run using resource parameters from
the command line rather than the CIB, so configurations can be tested
before trying to add them

* crm_resource --clear now prints out any cleared constraints, so you
know when it did something

* A new HealthIOWait resource agent is available for tracking node
health based on CPU I/O wait

* A couple of experimental features discussed earlier on this list: a
new tool crm_rule can check for rule expiration, and stonith_admin now
supports XML output for easier machine parsing.

For more details about changes in this release, please see the change
log:

https://github.com/ClusterLabs/pacemaker/blob/2.0/ChangeLog

Everyone is encouraged to download, compile and test the new release.
We do many regression tests and simulations, but we can't cover all
possible use cases, so your feedback is important and appreciated.

Many thanks to all contributors of source code to this release,
including Chris Lumens, Gao,Yan, Jan Pokorný, Jehan-Guillaume de
Rorthais, Ken Gaillot, Klaus Wenninger, and Maciej Sobkowiak.

A 1.1.21-rc1 with selected backports from this release will also be
released soon.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] failure-timeout

2019-04-11 Thread Ken Gaillot
On Thu, 2019-04-11 at 16:17 +0200, fatcha...@gmx.de wrote:
> Hi,
> 
> I´m running a pcs/corosync two node cluster on CentOS 7.6 
> I use a cloned ping resource and I'd like to add a failure-timeout to
> this.
> How do I do this ?
> 
> Any suggestions are welcome
> 
> Kind regards
> 
> fatcharly

The upstream document "Pacemaker Explained" is a good reference for
what options are available:

https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Explained/index.html#_resource_meta_attributes

In this case, failure-timeout is a resource meta-attribute. Look at
pcs's man page to see how to use "pcs resource meta" to set those.
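
For a cloned ping resource that would be something along the lines of
(substitute your clone's actual ID):

  pcs resource meta ping-clone failure-timeout=60s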
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

[ClusterLabs] Pacemaker security issues discovered and patched

2019-04-17 Thread Ken Gaillot
Hello all,

Jan Pokorný of Red Hat discovered three security-related issues in
Pacemaker that have been publicly disclosed today.

The most significant is a privilege escalation vulnerability (assigned
CVE-2018-16877). An unprivileged attacker with local access to a
pacemaker node when pacemaker is not running can create a process
pretending to be a pacemaker subdaemon. When pacemaker starts, it will
accept the impostor as valid, and the impostor can then craft messages
to manipulate other pacemaker subdaemons into performing commands as
root.

The other two are less significant. A local attacker can exploit the
same vulnerability for denial-of-service (assigned CVE-2018-16878). An
unrelated use-after-free bug in the alerts code (assigned CVE-2019-
3885) could expose environment variables in the pacemaker log,
resulting in information disclosure of sensitive information kept in
environment variables to local users with permissions to access the
pacemaker log but not wherever the environment variables are set.

Pull requests patching these vulnerabilities for the master and 1.1
branches of pacemaker will be merged shortly:

https://github.com/ClusterLabs/pacemaker/pull/1749

https://github.com/ClusterLabs/pacemaker/pull/1750

Without the patches, a mitigation is to prevent local user access to
cluster nodes except for cluster administrators (which is the
recommended and most common deployment model).

Due to the stricter authentication now imposed, a new requirement
(unlikely to be of interest to most users) is that the hacluster user
and haclient group must exist before running the executor and fencer
regression tests.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] Antw: Interacting with Pacemaker from my code

2019-07-16 Thread Ken Gaillot
On Tue, 2019-07-16 at 12:28 +0530, Nishant Nakate wrote:
> 
> On Tue, Jul 16, 2019 at 11:33 AM Ulrich Windl <
> ulrich.wi...@rz.uni-regensburg.de> wrote:
> > >>> Nishant Nakate  schrieb am 16.07.2019
> > um 05:37 in
> > Nachricht
> >  > >:
> > > Hi All,
> > > 
> > > I am new to this community and HA tools. Need some guidance on my
> > current
> > > handling pacemaker.
> > > 
> > > For one of my projects, I am using pacemaker for high
> > availability.
> > > Following the instructions provided in setup document, I am able
> > to create
> > > a new cluster with three nodes. Tested the failover and it works
> > as
> > > expected. Now, I need to run our own code on this nodes. I am
> > going to use
> > > Active-Passive topology for our system architecture.
> > > 
> > > I would like to know, how can I detect from my code if DC has
> > changed ? So
> > > that tools running on new DC can take over the controll of
> > application
> > > logic.
> > 
> > Why would you need that? crm_mon shows where ther DC is.
> > 
> 
> Thanks you Ulrich for your quick response.
> 
> I will give you a quick overview of the system. There would be 3
> nodes configured in a cluster. One would act as a leader and others
> as followers. Our system would be actively running on all the three
> nodes and serve external services assigned to them. On top of that,
> the Leader would have some services running. In case the leader node
> (DC) fails, load of failed leader needs to be distributed among
> followers. One should get elected as next leader and start the
> additional services the previous leader was running. 

In pacemaker, this is a typical promotable (master/slave) clone
workload. See the Pacemaker Explained section on clones:

https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Explained/index.html#s-resource-clone

(That's for pacemaker 2.0, the syntax is slightly different but the
concepts are the same for 1.1.)

As others have mentioned, you don't need to care about the DC for this
-- pacemaker will automatically select one of your nodes to be your
master.

> In this case, our system daemon running on the next node
> (designated as new coordinater) needs to know that DC has changed and
> start the additional services. Hope this clarifies my need.
> 
> AFAIK, crm_mon (or dcs) needs to be run from command line. Are you
> suggesting that I create a service which runs crm_mon and parse the
> output in a loop to check changes in DC/resources?
>  
> > > Also, how to configure my processes (tools) as resources so that
> > Pacemaker
> > > can make them HA.
> > 
> > You'll have to write an RA (resource agent) at least, but more
> > importantly the application has the be designed for HA to be
> > effective.
> 
> We are currently designing the services which are all stateless. If
> you have any quick suggestions to be considered, please let me know.
> Also, what type of RA would be helpful in this case. My services
> would be written in C++. 

You can keep your services in C++ and write a small OCF agent as a
shell script wrapper (almost identical to an init script, if you're
familiar with those). See the Pacemaker Administration section on OCF
agents:

https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Administration/#_ocf_resource_agents

That's pretty basic; you can find more guides elsewhere online. You can
start with the Stateful dummy resource agent shipped with pacemaker as
a template.

One thing that's not covered in detail is master scores. Your agent has
to set a master score for each node, and the node with the highest
master score will be chosen as master. Your agent uses the crm_master
command to do this (see its man page for details).
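
A very rough sketch of such a wrapper, just to show the shape -- the
daemon path and "mydaemon" name are placeholders, not anything from your
project, and a real agent also needs a meta-data action (see the Stateful
agent for that) and should report OCF_RUNNING_MASTER (8) from monitor
when it is running as master:

  #!/bin/sh
  : ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/lib/heartbeat}
  . ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs

  case "$1" in
    start)
      /usr/local/bin/mydaemon --start || exit $OCF_ERR_GENERIC
      crm_master -l reboot -v 10      # eligible as master, low preference
      exit $OCF_SUCCESS ;;
    stop)
      /usr/local/bin/mydaemon --stop
      crm_master -l reboot -D         # clear this node's master score
      exit $OCF_SUCCESS ;;
    monitor)
      /usr/local/bin/mydaemon --status && exit $OCF_SUCCESS
      exit $OCF_NOT_RUNNING ;;
    promote)
      crm_master -l reboot -v 100     # raise this node's preference
      /usr/local/bin/mydaemon --promote && exit $OCF_SUCCESS
      exit $OCF_ERR_GENERIC ;;
    demote)
      crm_master -l reboot -v 10
      /usr/local/bin/mydaemon --demote && exit $OCF_SUCCESS
      exit $OCF_ERR_GENERIC ;;
    *)
      exit $OCF_ERR_UNIMPLEMENTED ;;
  esac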

Good luck, and let us know how it goes.

> > > 
> > > Please let me know if you need any clarification or any other
> > information.
> > > 
> > > Thanks in advance !!!
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] resource location preference vs utilization

2019-07-16 Thread Ken Gaillot
On Mon, 2019-07-15 at 18:41 +0200, wf...@niif.hu wrote:
> Hi,
> 
> In a mostly symmetrical cluster I've got a couple of resources which
> should only ever run on a subset of the nodes if possible.  However,
> utilization constraints seem to prevent optimal resource allocation
> in
> some cases: Pacemaker does not migrate other resources to make room
> for
> the picky resources on their preferred nodes.  Is there a way around
> this?  The score differences are way above the resource stickiness,
> and
> as soon as I manually move enough indifferent resources away from the
> distinguished nodes, Pacemaker indeed migrates the picky resources
> over
> to them.  Can I configure it to also make room automatically by
> moving
> other resources as necessary, making this process fully automatic?

I'm not aware of any knobs to turn to affect that. Have you tried
putting small negative location constraints for the non-picky resources
on those nodes? Also, you could try putting a smaller stickiness on the
non-picky resources (I don't know whether that will have any effect,
just trying to think of ideas).
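
With pcs, for instance (resource and node names are placeholders):

  pcs constraint location nonpicky-rsc avoids fast-node1=100
  pcs resource meta nonpicky-rsc resource-stickiness=10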
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: Antw: Re: Q: "crmd[7281]: warning: new_event_notification (7281-97955-15): Broken pipe (32)" as response to resource cleanup

2019-08-13 Thread Ken Gaillot
On Tue, 2019-08-13 at 11:06 +0200, Ulrich Windl wrote:
> Hi,
> 
> an update:
> After setting a failure-timeout for the resource that stale monitor
> failure
> was removed automatically at next cluster recheck (it seems).
> Still I wonder why a resource cleanup didn't do that (bug?).

Possibly ... also possibly fixed already, as there have been a few
clean-up related fixes in the past few versions. I'm not sure what's
been backported in the build you have.

> 
> Regards,
> Ulrich
> 
> 
> > > > "Ulrich Windl"  schrieb am
> > > > 13.08.2019
> 
> um
> 10:07 in Nachricht <5d526fb002a100032...@gwsmtp.uni-regensburg.de
> >:
> > > > > Ken Gaillot  schrieb am 13.08.2019 um
> > > > > 01:03 in
> > 
> > Nachricht
> > :
> > > On Mon, 2019‑08‑12 at 17:46 +0200, Ulrich Windl wrote:
> > > > Hi!
> > > > 
> > > > I just noticed that a "crm resource cleanup " caused some
> > > > unexpected behavior and the syslog message:
> > > > crmd[7281]:  warning: new_event_notification (7281‑97955‑15):
> > > > Broken
> > > > pipe (32)
> > > > 
> > > > It's SLES14 SP4 last updated Sept. 2018 (up since then,
> > > > pacemaker‑
> > > > 1.1.19+20180928.0d2680780‑1.8.x86_64).
> > > > 
> > > > The cleanup was due to a failed monitor. As an unexpected
> > > > consequence
> > > > of this cleanup, CRM seemed to restart the complete resource
> > > > (and
> > > > dependencies), even though it was running.
> > > 
> > > I assume the monitor failure was old, and recovery had already
> > > completed? If not, recovery might have been initiated before the
> > > clean‑
> > > up was recorded.
> > > 
> > > > I noticed that a manual "crm_resource ‑C ‑r  ‑N "
> > > > command
> > > > has the same effect (multiple resources are "Cleaned up",
> > > > resources
> > > > are restarted seemingly before the "probe" is done.).
> > > 
> > > Can you verify whether the probes were done? The DC should log a
> > > message when each _monitor_0 result comes in.
> > 
> > So here's a rough sketch of events:
> > 17:10:23 crmd[7281]:   notice: State transition S_IDLE ->
> > S_POLICY_ENGINE
> > ...no probes yet...
> > 17:10:24 pengine[7280]:  warning: Processing failed monitor of 
> > prm_nfs_server
> > on rksaph11: not running
> > ...lots of starts/restarts...
> > 17:10:24 pengine[7280]:   notice:  * Restartprm_nfs_server  
> > ...
> > 17:10:24 crmd[7281]:   notice: Processing graph 6628
> > (ref=pe_calc-dc-1565622624-7313) derived from
> > /var/lib/pacemaker/pengine/pe-input-1810.bz2
> > ...monitors are being called...
> > 17:10:24 crmd[7281]:   notice: Result of probe operation for
> > prm_nfs_vg on
> > h11: 0 (ok)
> > ...the above was the first probe result...
> > 17:10:24 crmd[7281]:  warning: Action 33 (prm_nfs_vg_monitor_0) on
> > h11 
> > failed
> > (target: 7 vs. rc: 0): Error
> > ...not surprising to me: The resource was running; I don't know why
> > the
> > cluster want to start it...

That's normal, that's how pacemaker detects active resources after
clean-up. It schedules a probe and start on the assumption that the
probe will find the resource not running; if it is running, the probe
result will cause a new transition where the start isn't needed.

That message will be improved in the next version (already in master
branch), like:

notice: Transition 10 action 33 (prm_nfs_vg_monitor_0 on h11): expected
'not running' but got 'ok'

> > 17:10:24 crmd[7281]:   notice: Transition 6629 (Complete=9,
> > Pending=0,
> > Fired=0, Skipped=0, Incomplete=0,
> > Source=/var/lib/pacemaker/pengine/pe-input-1811.bz2): Complete
> > 17:10:24 crmd[7281]:   notice: State transition S_TRANSITION_ENGINE
> > ->
> 
> S_IDLE
> > 
> > The really bad thing after this is that the "cleaned up" resource
> > still has
> > a
> > failed status (dated in the past (last-rc-change='Mon Aug 12
> > 04:52:23 
> > 2019')),
> > even though "running".
> > 
> > I tend to believe that the cluster is in a bad state, or the
> > software has a
> > problem cleaning the status of the monitor.

It does sound like a clean-up bug, but I'm not aware of any current
issues. I suspect it's already fixed.

> > The CIB status for the resource looks like this:
> >  > class="

Re: [ClusterLabs] cloned ethmonitor - upon failure of all nodes

2019-08-15 Thread Ken Gaillot
On Thu, 2019-08-15 at 10:59 +0100, solarmon wrote:
> Hi,
> 
> I have a two node cluster setup where each node is multi-homed over
> two separate external interfaces - net4 and net5 - that can have
> traffic load balanced between them.
> 
> I have created multiple virtual ip resources (grouped together) that
> should only be active on only one of the two nodes.
> 
> I have created ethmonitor resources for net4 and net5 and have
> created a constraint against the virtual ip resources group.
> 
> When one of the net4/net5 interfaces is taken down on the active node
> (where the virtual IPs are), the virtual ip resource group switches
> to the other node. This is working as expected.
> 
> However, when either of the net4/net5 interfaces are down on BOTH
> nodes - for example, if net4 is down on BOTH nodes - the cluster
> seems to get itself into a flapping state where the virtual IP
> resources keep becoming available then unavailable. Or the virtual
> IP resources group isn't running on any node.
> 
> Since net4 and net5 interfaces can have traffic load-balanced across
> them, it is acceptable for the virtual IP resources to be running on
> either node, even if the same interface (for example, net4) is down
> on both nodes, since the other interface (for example, net5) is still
> available on both nodes.
> 
> What is the recommended way to configure the ethmonitor and
> constraint resources for this type of multi-homed setup?

It's probably the constraint. When monitoring a single interface, the
location constraint should have a rule giving a score of -INFINITY when
the special node attribute's value is 0.

However in your case, your goal is more complicated, so the rule has to
be as well. I'd set a -INFINITY score when *both* attributes are 0
(e.g. ethmonitor-net4 eq 0 and ethmonitor-net5 eq 0). That will keep
the IPs on a node where at least one interface is working.

If you additionally want to prefer a node with both interfaces working,
I'd add 2 more rules giving a slightly negative preference to a node
where a single attribute is 0 (one rule for each attribute).
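
Roughly, with pcs (the group name "vip-group" is a placeholder, and the
attribute names assume ethmonitor's default "ethmonitor-<interface>"
naming), in place of the current per-interface -INFINITY constraint:

  pcs constraint location vip-group rule score=-INFINITY \
      ethmonitor-net4 eq 0 and ethmonitor-net5 eq 0
  pcs constraint location vip-group rule score=-1000 ethmonitor-net4 eq 0
  pcs constraint location vip-group rule score=-1000 ethmonitor-net5 eq 0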
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] 2-node DRBD Pacemaker not performing as expected: Where to next?

2019-08-15 Thread Ken Gaillot
On Thu, 2019-08-15 at 11:25 -0400, Nickle, Richard wrote:
> 
> My objective is two-node active/passive DRBD device which would
> automatically fail over, a secondary objective would be to use
> standard, stock and supported software distributions and repositories
> with as little customization as possible.
> 
> I'm using Ubuntu 18.04.3, plus the DRBD, corosync and Pacemaker that
> are in the (LTS) repositories.  DRBD drbdadm reports version 8.9.10. 
> Corosync is 2.4.3, and Pacemaker is 0.9.164.
> 
> For my test scenario, I would have two nodes up and running, I would
> reboot, disconnect or shut down one node, and the other node would
> then after a delay take over.  That's the scenario I wanted to
> cover:  unexpected loss of a node.  The application is supplementary
> and isn't life safety or mission critical, but it would be measured,
> and the goal would be to stay above 4 nines of uptime annually.
> 
> All of this is working for me, I can manually failover by telling PCS
> to move my resource from one node to another.  If I reboot the
> primary node, the failover will not complete until the primary is
> back online.  Occasionally I'd get split-brain by doing these hard
> kills, which would require manual recovery.
> 
> I added STONITH and watchdog using SBD with an iSCSI block device and
> softdog.  

So far, so good ... except for softdog. Since it's a kernel module, if
something goes wrong at the kernel level, it might fail to execute, and
you might still get split-brain (though much less likely than without
fencing at all). A hardware watchdog or external power fencing is much
more reliable, and if you're looking for 4 9s, it's worth it.

> I added a qdevice to get an odd-numbered quorum.
> 
> When I run crm_simulate on this, the simulation says that if I down
> the primary node, it will promote the resource to the secondary.
> 
> And yet I still see the same behavior:  crashing the primary, there
> is no promotion until after the primary returns online, and after
> that the secondary is smoothly promoted and the primary demoted.
> 
> Getting each component of this stack configured and running has had
> substantial challenges, with regards to compatibility, documentation,
> integration bugs, etc.
> 
> I see other people reporting problems similar to mine, I'm wondering
> if there's a general approach, or perhaps I need a nudge in a new
> direction to tackle this issue?
> 
> * Should I continue to focus on the existing Pacemaker
> configuration?  perhaps there's some hidden or absent
> order/constraint/weighting that is causing this behavior?

It's hard to say without configuration and logs. I'd start by checking
the logs to see whether fencing succeeded when the node was killed. If
fencing fails, pacemaker can't recover anything from the dead node.

> * Should I dig harder at the DRBD configuration?  Is it something
> about the fencing scripts?

It is a good idea to tie DRBD's fencing scripts to pacemaker. The
LINBIT DRBD docs are the best info for that, where it mentions setting
fence-peer to a crm-fence-peer script.
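
For DRBD 8.4 the LINBIT guide's recommendation boils down to roughly this
in the resource definition (double-check the paths and options for your
exact drbd-utils version):

  disk {
    fencing resource-only;
  }
  handlers {
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }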

> * Should I try stripping this back down to something more basic?  Can
> I have a reliable failover without STONITH, SBD and an odd-numbered
> quorum?

There's nothing wrong with having both SBD with shared disk, and
qdevice, but you don't need both. If you run qdevice, SBD can get the
quorum information from pacemaker, so it doesn't require the shared
disk.

> * It seems possible that moving to DRBD 9.X might take some of the
> problem off of Pacemaker altogether since it has built in failover
> apparently, is that an easier win?
> * Should I go to another stack?  I'm trying to work within LTS
> releases for stability, but perhaps I would get better integrations
> with RHEL 7, CentOS 7, an edge release of Ubuntu, or some other
> distribution?

There are advantages and disadvantages to changing either of the above,
but I doubt any choice will be easier, just a different set of
roadblocks to work through.
 
> Thank you for your consideration!
> 
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Querying failed rersource operations from the CIB

2019-08-12 Thread Ken Gaillot
On Mon, 2019-08-12 at 11:15 +0200, Ulrich Windl wrote:
> Hi!
> 
> Back in December 2011 I had written a script to retrieve all failed
> resource operations by using "cibadmin -Q -o  lrm_resources" as data
> base. I was querying lrm_rsc_op for op-status != 0.
> In a newer release this does not seems to work anymore.
> 
> I see resource IDs ending with "_last_0", "_monitor_6", and
> "_last_failure_0", but even in the "_last_failure_0" the op-status is
> "0" (rc-code="7").
> Is this some bug, or is it a feature? That is: When will op-status be
> != 0?

rc-code is the result of the action itself (i.e. the resource agent),
whereas op-status is the result of pacemaker's attempt to execute the
agent.

If pacemaker was able to successfully initiate the resource agent and
get a reply back, then op-status will be 0, regardless of the rc-code
reported by the agent.

op-status will be nonzero when it couldn't get a result from the agent
-- the agent is not installed on the node, the agent timed out, the
connection to the local executor or Pacemaker Remote was lost, the
action was requested while the node was shutting down, etc.

There's also a special op-status (193) that indicates an action is
pending (i.e. it has been initiated and we're waiting for it to
complete). This is only seen when record-pending is true.

> crm_mon still reports a resource failure like this:
> Failed Resource Actions:
> * prm_nfs_server_monitor_6 on h11 'not running' (7): call=738,
> status=complete, exitreason='',
> last-rc-change='Mon Aug 12 04:52:23 2019', queued=0ms, exec=0ms
> 
> (it seems the nfs server monitor does this under load in SLES12 SP4,
> and I wonder where to look for the reason)
> BTW: "lrm_resources" is not documented, and the structure seemes to
> change. Can I restrict the output to LRM data?

One possibility is to run crm_mon with --as-xml and parse the failed
actions from that output. The schema is distributed as crm_mon.rng.
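
Something along these lines (an untested sketch; the element names come
from crm_mon.rng) would pull out just the failed operations:

  crm_mon --as-xml | xmllint --xpath '//failures/failure' -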

> Regards,
> Ulrich
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Q: "crmd[7281]: warning: new_event_notification (7281-97955-15): Broken pipe (32)" as response to resource cleanup

2019-08-12 Thread Ken Gaillot
On Mon, 2019-08-12 at 17:46 +0200, Ulrich Windl wrote:
> Hi!
> 
> I just noticed that a "crm resource cleanup " caused some
> unexpected behavior and the syslog message:
> crmd[7281]:  warning: new_event_notification (7281-97955-15): Broken
> pipe (32)
> 
> It's SLES14 SP4 last updated Sept. 2018 (up since then, pacemaker-
> 1.1.19+20180928.0d2680780-1.8.x86_64).
> 
> The cleanup was due to a failed monitor. As an unexpected consequence
> of this cleanup, CRM seemed to restart the complete resource (and
> dependencies), even though it was running.

I assume the monitor failure was old, and recovery had already
completed? If not, recovery might have been initiated before the clean-
up was recorded.

> I noticed that a manual "crm_resource -C -r  -N " command
> has the same effect (multiple resources are "Cleaned up", resources
> are restarted seemingly before the "probe" is done.).

Can you verify whether the probes were done? The DC should log a
message when each _monitor_0 result comes in.

> Actually the manual says when cleaning up a single primitive, the
> whole group is cleaned up, unless using --force. Well, I don't like
> this default, as I expect any status change from probe would
> propagate to the group anyway...

In 1.1, clean-up always wipes the history of the affected resources,
regardless of whether the history is for success or failure. That means
all the cleaned resources will be reprobed. In 2.0, clean-up by default
wipes the history only if there's a failed action (--refresh/-R is
required to get the 1.1 behavior). That lessens the impact of the
"default to whole group" behavior.

I think the original idea was that a group indicates that the resources
are closely related, so changing the status of one member might affect
what status the others report.
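
In terms of commands, the 2.0 behavior looks roughly like this (resource
and node names are placeholders):

  crm_resource --cleanup -r my-rsc -N node1  # wipe history only where something failed
  crm_resource --refresh -r my-rsc -N node1  # wipe all history and reprobe (1.1 behavior)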

> Regards,
> Ulrich
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] why is node fenced ?

2019-08-12 Thread Ken Gaillot
On Mon, 2019-08-12 at 18:09 +0200, Lentes, Bernd wrote:
> Hi,
> 
> last Friday (9th of August) i had to install patches on my two-node
> cluster.
> I put one of the nodes (ha-idg-2) into standby (crm node standby ha-
> idg-2), patched it, rebooted, 
> started the cluster (systemctl start pacemaker) again, put the node
> again online, everything fine.
> 
> Then i wanted to do the same procedure with the other node (ha-idg-
> 1).
> I put it in standby, patched it, rebooted, started pacemaker again.
> But then ha-idg-1 fenced ha-idg-2, it said the node is unclean.
> I know that nodes which are unclean need to be shutdown, that's
> logical.
> 
> But i don't know from where the conclusion comes that the node is
> unclean respectively why it is unclean,
> i searched in the logs and didn't find any hint.

The key messages are:

Aug 09 17:43:27 [6326] ha-idg-1   crmd: info: crm_timer_popped: 
Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
Aug 09 17:43:27 [6326] ha-idg-1   crmd:  warning: do_log:   Input 
I_DC_TIMEOUT received in state S_PENDING from crm_timer_popped

That indicates the newly rebooted node didn't hear from the other node
within 20s, and so assumed it was dead.

The new node had quorum, but never saw the other node's corosync, so
I'm guessing you have two_node and/or wait_for_all disabled in
corosync.conf, and/or you have no-quorum-policy=ignore in pacemaker.

I'd recommend two_node: 1 in corosync.conf, with no explicit
wait_for_all or no-quorum-policy setting. That would ensure a
rebooted/restarted node doesn't get initial quorum until it has seen
the other node.
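
Roughly, in corosync.conf:

  quorum {
      provider: corosync_votequorum
      two_node: 1
      # no wait_for_all line needed -- two_node enables it by default
  }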

> I put the syslog and the pacemaker log on a seafile share, i'd be
> very thankful if you'll have a look.
> https://hmgubox.helmholtz-muenchen.de/d/53a10960932445fb9cfe/
> 
> Here the cli history of the commands:
> 
> 17:03:04  crm node standby ha-idg-2
> 17:07:15  zypper up (install Updates on ha-idg-2)
> 17:17:30  systemctl reboot
> 17:25:21  systemctl start pacemaker.service
> 17:25:47  crm node online ha-idg-2
> 17:26:35  crm node standby ha-idg1-
> 17:30:21  zypper up (install Updates on ha-idg-1)
> 17:37:32  systemctl reboot
> 17:43:04  systemctl start pacemaker.service
> 17:44:00  ha-idg-1 is fenced
> 
> Thanks.
> 
> Bernd
> 
> OS is SLES 12 SP4, pacemaker 1.1.19, corosync 2.3.6-9.13.1
> 
> 
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Master/slave failover does not work as expected

2019-08-12 Thread Ken Gaillot
On Mon, 2019-08-12 at 23:09 +0300, Andrei Borzenkov wrote:
> 
> 
> On Mon, Aug 12, 2019 at 4:12 PM Michael Powell <
> michael.pow...@harmonicinc.com> wrote:
> > At 07:44:49, the ss agent discovers that the master instance has
> > failed on node mgraid…-0 as a result of a failed ssadm request in
> > response to an ss_monitor() operation.  It issues a crm_master -Q
> > -D command with the intent of demoting the master and promoting the
> > slave, on the other node, to master.  The ss_demote() function
> > finds that the application is no longer running and returns
> > OCF_NOT_RUNNING (7).  In the older product, this was sufficient to
> > promote the other instance to master, but in the current product,
> > that does not happen.  Currently, the failed application is
> > restarted, as expected, and is promoted to master, but this takes
> > 10’s of seconds.
> >  
> > 
> 
> Did you try to disable resource stickiness for this ms?

Stickiness shouldn't affect where the master role is placed, just
whether the resource instances should stay on their current nodes
(independently of whether their role is staying the same or changing).

Are there any constraints that apply to the master role?

Another possibility is that you are mixing crm_master with and without
--lifetime=reboot (which controls whether the master attribute is
transient or permanent). Transient should really be the default but
isn't for historical reasons. It's a good idea to always use --
lifetime=reboot. You could double-check with "cibadmin -Q|grep master-" 
and see if there is more than one entry per node.
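
In other words, something along these lines (a sketch only -- the score
is whatever your agent normally uses):

  # inside the agent, keep the promotion score transient
  crm_master --lifetime=reboot -v 100   # when promotable
  crm_master --lifetime=reboot -D       # when demoting / on failure

  # then look for leftover *permanent* entries
  cibadmin -Q | grep master-
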
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] why is node fenced ?

2019-08-14 Thread Ken Gaillot
On Wed, 2019-08-14 at 11:57 +0200, Lentes, Bernd wrote:
> 
> - On Aug 13, 2019, at 1:19 AM, kgaillot kgail...@redhat.com
> wrote:
> 
> 
> > 
> > The key messages are:
> > 
> > Aug 09 17:43:27 [6326] ha-idg-1   crmd: info:
> > crm_timer_popped: Election
> > Trigger (I_DC_TIMEOUT) just popped (20000ms)
> > Aug 09 17:43:27 [6326] ha-idg-1   crmd:  warning:
> > do_log:   Input
> > I_DC_TIMEOUT received in state S_PENDING from crm_timer_popped
> > 
> > That indicates the newly rebooted node didn't hear from the other
> > node
> > within 20s, and so assumed it was dead.
> > 
> > The new node had quorum, but never saw the other node's corosync,
> > so
> > I'm guessing you have two_node and/or wait_for_all disabled in
> > corosync.conf, and/or you have no-quorum-policy=ignore in
> > pacemaker.
> > 
> > I'd recommend two_node: 1 in corosync.conf, with no explicit
> > wait_for_all or no-quorum-policy setting. That would ensure a
> > rebooted/restarted node doesn't get initial quorum until it has
> > seen
> > the other node.
> 
> That's my setting:
> 
> expected_votes: 2
>   two_node: 1
>   wait_for_all: 0
> 
> no-quorum-policy=ignore
> 
> I did that because i want be able to start the cluster although one
> node has e.g. a hardware problem.
> Is that ok ?

Well that's why you're seeing what you're seeing, which is also why
wait_for_all was created :)

You definitely don't need no-quorum-policy=ignore in any case. With
two_node, corosync will continue to provide quorum to pacemaker when
one node goes away, so from pacemaker's view no-quorum-policy never
kicks in.

With wait_for_all enabled, the newly joining node wouldn't get quorum
initially, so it wouldn't fence the other node. So that's the trade-
off, preventing this situation vs being able to start one node alone
intentionally. Personally, I'd leave wait_for_all on normally, and
manually change it to 0 whenever I was intentionally taking one node
down for an extended time.

Of course all of that is just recovery, and doesn't explain why the
nodes can't see each other to begin with.

> 
> 
> Bernd
>  
> 
> Helmholtz Zentrum Muenchen
> Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
> Ingolstaedter Landstr. 1
> 85764 Neuherberg
> www.helmholtz-muenchen.de
> Aufsichtsratsvorsitzende: MinDir'in Prof. Dr. Veronika von Messling
> Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep,
> Heinrich Bassler, Kerstin Guenther
> Registergericht: Amtsgericht Muenchen HRB 6466
> USt-IdNr: DE 129521671
> 
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Q: "Re-initiated expired calculated failure"

2019-08-14 Thread Ken Gaillot
On Wed, 2019-08-14 at 10:24 +0200, Ulrich Windl wrote:
> (subject changed for existing thread)
> 
> Hi!
> 
> After I had thought the problem with the sticky failed monitor was
> solved
> eventually, I realized that I'm getting a message that I don't really
> understand after each cluster recheck interval:
> 
> pengine[7280]:   notice: Re-initiated expired calculated failure
> prm_nfs_server_monitor_6 (rc=7,
> magic=0:7;4:6568:0:d941efc1-de73-4ee4-b593-f65be9e90726) on h11
> 
> The message repeats absolutely identical. So what does it mean? The 

That one confuses me too.

An expired failure is simply ignored for non-recurring operations. But
for expired failures of a recurring monitor, if the node is up, the
monitor's restart digest is altered, which I believe causes it to be
cancelled and re-scheduled.

The reason in the commit message was "This is particularly relevant for
those with on-fail=block which stick around and are not cleaned up by a
subsequent stop/start."

I don't claim to understand it. :)

> monitor
> did not fail between cluster rechecks, and crm_mon is not displaying
> any failed
> operations.

Probably because it's expired. A clean-up should still get rid of it,
though.

> 
> Regards,
> Ulrich
> 
> 
> > > > Ulrich Windl schrieb am 13.08.2019 um 11:06 in Nachricht
> > > > <5D527D91.124 :
> 
> 161 :
> 60728>:
> > Hi,
> > 
> > an update:
> > After setting a failure-timeout for the resource that stale monitor
> > failure
> > was removed automatically at next cluster recheck (it seems).
> > Still I wonder why a resource cleanup didn't do that (bug?).
> > 
> > Regards,
> > Ulrich
> > 
> > 
> > > > > "Ulrich Windl"  schrieb am
> > > > > 13.08.2019
> 
> um
> > 10:07 in Nachricht <
> > 5d526fb002a100032...@gwsmtp.uni-regensburg.de>:
> > > > > > Ken Gaillot  schrieb am 13.08.2019 um
> > > > > > 01:03 in
> > > 
> > > Nachricht
> > > :
> > > > On Mon, 2019‑08‑12 at 17:46 +0200, Ulrich Windl wrote:
> > > > > Hi!
> > > > > 
> > > > > I just noticed that a "crm resource cleanup " caused
> > > > > some
> > > > > unexpected behavior and the syslog message:
> > > > > crmd[7281]:  warning: new_event_notification (7281‑97955‑15):
> > > > > Broken
> > > > > pipe (32)
> > > > > 
> > > > > It's SLES14 SP4 last updated Sept. 2018 (up since then,
> > > > > pacemaker‑
> > > > > 1.1.19+20180928.0d2680780‑1.8.x86_64).
> > > > > 
> > > > > The cleanup was due to a failed monitor. As an unexpected
> > > > > consequence
> > > > > of this cleanup, CRM seemed to restart the complete resource
> > > > > (and
> > > > > dependencies), even though it was running.
> > > > 
> > > > I assume the monitor failure was old, and recovery had already
> > > > completed? If not, recovery might have been initiated before
> > > > the clean‑
> > > > up was recorded.
> > > > 
> > > > > I noticed that a manual "crm_resource ‑C ‑r  ‑N "
> > > > > command
> > > > > has the same effect (multiple resources are "Cleaned up",
> > > > > resources
> > > > > are restarted seemingly before the "probe" is done.).
> > > > 
> > > > Can you verify whether the probes were done? The DC should log
> > > > a
> > > > message when each _monitor_0 result comes in.
> > > 
> > > So here's a rough sketch of events:
> > > 17:10:23 crmd[7281]:   notice: State transition S_IDLE ->
> > > S_POLICY_ENGINE
> > > ...no probes yet...
> > > 17:10:24 pengine[7280]:  warning: Processing failed monitor of 
> > > prm_nfs_server
> > > on rksaph11: not running
> > > ...lots of starts/restarts...
> > > 17:10:24 pengine[7280]:   notice:  * Restartprm_nfs_server  
> > > ...
> > > 17:10:24 crmd[7281]:   notice: Processing graph 6628
> > > (ref=pe_calc-dc-1565622624-7313) derived from
> > > /var/lib/pacemaker/pengine/pe-input-1810.bz2
> > > ...monitors are being called...
> > > 17:10:24 crmd[7281]:   notice: Result of probe operation for
> > > prm_nfs_vg
> 
> on
> > > h11: 0 (ok)
> > > ...the above was the first probe result...
> > > 17:10:24 crmd[728

Re: [ClusterLabs] node name issues (Could not obtain a node name for corosync nodeid 739512332)

2019-08-22 Thread Ken Gaillot
739512331
> crmd: info: corosync_node_name:  Unable to get node name for
> nodeid 739512331
> crmd: info: pcmk_quorum_notification:Obtaining name for
> new node 739512331
> crmd: info: corosync_node_name:  Unable to get node name for
> nodeid 739512331
> crmd:   notice: get_node_name:   Could not obtain a node name for
> corosync nodeid 739512331
> crmd:   notice: crm_update_peer_state_iter:  Node (null) state is
> now member | nodeid=739512331 previous=unknown
> source=pcmk_quorum_notification
> crmd:   notice: crm_update_peer_state_iter:  Node h12 state is
> now member | nodeid=739512332 previous=unknown
> source=pcmk_quorum_notification
> crmd: info: peer_update_callback:Cluster node h12 is now
> member (was in unknown state)
> crmd: info: corosync_node_name:  Unable to get node name for
> nodeid 739512332
> crmd:   notice: get_node_name:   Defaulting to uname -n for the local
> corosync node name
> ...
> 
> ???
> 
> attrd: info: corosync_node_name:  Unable to get node name for
> nodeid 739512332
> attrd:   notice: get_node_name:   Defaulting to uname -n for the
> local corosync node name
> attrd: info: main:CIB connection active
> ...
> stonith-ng:   notice: get_node_name:   Could not obtain a node name
> for corosync nodeid 739512331
> stonith-ng: info: crm_get_peer:Created entry 956e8bf0-5634-
> 4535-aa72-cdd6cf319d5b/0x1d04440 for node (null)/739512331 (2 total)
> stonith-ng: info: crm_get_peer:Node 739512331 has uuid
> 739512331
> stonith-ng: info: pcmk_cpg_membership: Node 739512331 still
> member of group stonith-ng (peer=(null):7277, counter=0.0, at least
> once)
> stonith-ng: info: crm_update_peer_proc:pcmk_cpg_membership:
> Node (null)[739512331] - corosync-cpg is now online
> stonith-ng:   notice: crm_update_peer_state_iter:  Node (null)
> state is now member | nodeid=739512331 previous=unknown
> source=crm_update_peer_proc
> ...
> attrd: info: corosync_node_name:  Unable to get node name for
> nodeid 739512331
> attrd:   notice: get_node_name:   Could not obtain a node name for
> corosync nodeid 739512331
> attrd: info: crm_get_peer:Created entry 40380a43-c1e2-498a-
> bc9e-d68968acf4d6/0x2572850 for node (null)/739512331 (2 total)
> attrd: info: crm_get_peer:Node 739512331 has uuid 739512331
> attrd: info: pcmk_cpg_membership: Node 739512331 still member
> of group attrd (peer=(null):7279, counter=0.0, at least once)
> attrd: info: crm_update_peer_proc:pcmk_cpg_membership: Node
> (null)[739512331] - corosync-cpg is now online
> attrd:   notice: crm_update_peer_state_iter:  Node (null) state
> is now member | nodeid=739512331 previous=unknown
> source=crm_update_peer_proc
> attrd: info: pcmk_cpg_membership: Node 739512332 still member
> of group attrd (peer=h12:40553, counter=0.1, at least once)
> attrd: info: crm_get_peer:Node 739512331 is now known as h11
> attrd:   notice: attrd_check_for_new_writer:  Recorded new
> attribute writer: h11 (was unset)
> ...
> crmd: info: pcmk_cpg_membership: Node 739512332 joined group
> crmd (counter=0.0, pid=0, unchecked for rivals)
> crmd: info: corosync_node_name:  Unable to get node name for
> nodeid 739512331
> crmd:   notice: get_node_name:   Could not obtain a node name for
> corosync nodeid 739512331
> crmd: info: pcmk_cpg_membership: Node 739512331 still member
> of group crmd (peer=(null):7281, counter=0.0, at least once)
> crmd: info: crm_update_peer_proc:pcmk_cpg_membership: Node
> (null)[739512331] - corosync-cpg is now online
> 
> ???
> 
> crmd: info: pcmk_cpg_membership: Node 739512332 still member
> of group crmd (peer=h12:40555, counter=0.1, at least once)
> crmd: info: crm_get_peer:Node 739512331 is now known as h11
> crmd: info: peer_update_callback:Cluster node h11 is now
> member
> crmd: info: update_dc:   Set DC to h11 (3.0.14)
> crmd: info: crm_update_peer_expected:update_dc: Node
> h11[739512331] - expected state is now member (was (null))
> ...
> 
> I feel this mess with determining the node name is overly
> complicated...
> 
> Regards,
> Ulrich

Complicated, yes -- overly, depends on your point of view :)

Putting "name:" in corosync.conf simplifies things.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Q: "pengine[7280]: error: Characters left over after parsing '10#012': '#012'"

2019-08-22 Thread Ken Gaillot
On Thu, 2019-08-22 at 12:09 +0200, Jan Pokorný wrote:
> On 22/08/19 08:07 +0200, Ulrich Windl wrote:
> > When a second node joined a two-node cluster, I noticed the
> > following error message that leaves me kind of clueless:
> >  pengine[7280]:error: Characters left over after parsing
> > '10#012': '#012'
> > 
> > Where should I look for these characters?

The message is coming from pacemaker's function that scans an integer
from a string (usually user-provided). I'd check the CIB (especially
cluster properties) and /etc/sysconfig/pacemaker (or OS equivalent).

Octal 012 would be a newline/line feed character, so one possibility is
that whatever software was used to edit one of those files added an
encoding of it.
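
If it helps, a rough way to hunt for it (assuming the newline ended up
in a CIB attribute value, where libxml2 normally serializes it as
&#10;):

  cibadmin -Q | grep -n '&#10;'

and in /etc/sysconfig/pacemaker, look for a value that got split across
two lines.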

> Given it's pengine related, one of the ideas is it's related to:
> 
> 
https://github.com/ClusterLabs/pacemaker/commit/9cf01f5f987b5cbe387c4e040ff5bfd6872eb0ad

I don't think so, or it would have the action name in it. Also, that
won't take effect until a cluster is entirely upgraded to a version
that supports it.

> Therefore it'd be nothing to try to tackle in the user-facing
> configuration, but some kind of internal confusion, perhaps stemming
> from mixing pacemaker version within the cluster?
> 
> By any chance, do you have an interval of 12 seconds configured
> at any operation for any resource?
> 
> (The only other and unlikely possibility I can immediately see is
> having one of pe-*-series-max cluster options misconfigured.)
> 
> > The message was written after an announced resource move to the new
> > node.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] New status reporting for starting/stopping resources in 1.1.19-8.el7

2019-09-03 Thread Ken Gaillot
On Sat, 2019-08-31 at 03:39 +, Chris Walker wrote:
> Hello,
> The 1.1.19-8 EL7 version of Pacemaker contains a commit ‘Feature:
> crmd: default record-pending to TRUE’ that is not in the ClusterLabs
> Github repo.  This commit changes the reporting for resources that
> are in the process of starting and stopping for (at least) crm_mon
> and crm_resource
> crm_mon
>   Resources that are in the process of starting
>     Old reporting: Stopped
>     New reporting: Starting
>   Resources that are in the process of stopping
>     Old reporting: Started
>     New reporting: Stopping
> 
> crm_resource -r <rsc> -W
>   Resources that are in the process of starting
>     Old reporting: resource <rsc> is NOT running
>     New reporting: resource <rsc> is running on: <node>
>   Resources that are in the process of stopping
>     Old reporting: resource <rsc> is running on: <node>
>     New reporting: resource <rsc> is NOT running
>  
> The change to crm_mon is helpful and accurately reflects the current
> state of the resource, but the new reporting from crm_resource seems
> somewhat misleading.  Was this the intended reporting?  Regardless, 

Interesting, I never looked at how crm_resource showed pending actions.
That should definitely be improved.

The record-pending option itself has been around forever, and so has
this behavior when it is set to true. The only difference is that it
now defaults to true.

> the fact that this commit is not in the upstream ClusterLab repo
> makes me wonder whether this will be the default status reporting
> going forward (I will try the 2.0 branch soon).

It indeed was changed in the 2.0.0 release. RHEL 7 backported the
change from there.

>  
> Thanks,
> Chris
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] Q: Recommended directory for RA auxiliary files?

2019-09-03 Thread Ken Gaillot
On Mon, 2019-09-02 at 15:23 +0200, Ulrich Windl wrote:
> Hi!
> 
> Are there any recommendations where to place (fixed content) files an
> RA uses?
> Usually my RAs use a separate XML file for the metadata, just to
> allow editing it in XML mode automatically.
> Traditionally I put the file in the same directory as the RA itself
> (like "cat $0.xml" for meta-data).
> Are there any expectations that every file in the RA directory is an
> RA?
> (Currently I'm extending an RA, and I'd like to provide some
> additional user-modifiable template file, and I wonder which path to
> use)
> 
> Regards,
> Ulrich

I believe most (maybe even all modern?) deployments have both lib and
resource.d under /usr/lib/ocf. If you have a custom provider for the RA
under resource.d, it would make sense to use the same pattern under
lib.

If you want to follow the FHS, you might consider /usr/share if you're
installing via custom packages, /usr/local/share if you're just
installing locally, or /srv in either case.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: Re: Q: The effect of using "default" attribute in RA metadata

2019-09-05 Thread Ken Gaillot
On Thu, 2019-09-05 at 09:31 +0200, Ulrich Windl wrote:
> > > > Tomas Jelinek  schrieb am 05.09.2019 um
> > > > 09:22 in
> 
> Nachricht
> <651630f8-b871-e4c1-68d8-e6a42dd29...@redhat.com>:
> > Dne 03. 09. 19 v 11:27 Ulrich Windl napsal(a):
> > > Hi!
> > > 
> > > Reading the RA API metadata specification, there is a "default"
> > > attribute 
> > 
> > for "parameter".
> > > I wonder what the effect of specifying a default is: Is it
> > > purely 
> > 
> > documentation (and the RA has to take care it uses the same default
> > value as
> > in the metadata), or will the configuration tools actually use that
> > value if
> > the user did not specify a parameter value?
> > 
> > Pcs doesn't use the default values. If you don't specify a value
> > for an 
> > option, pcs simply doesn't put that option into the CIB leaving it
> > to 
> > the RA to figure out a default value. This has a benefit of always 
> > following the default even if it changes. There is no plan to
> > change the 
> > behavior.
> 
> I see. However changing a default value (that way) can cause
> unexpected
> surprises at the user's end.
> When copying the default to the actual resource configuration at the
> time when
> it was configured could prevent unexpected surprises (and the values
> being used
> are somewhat "documented") in the configuration.
> I agree that it's no longer obvious then whether those default values
> were set
> explicitly or implicitly,
> 
> > 
> > Copying default values to the CIB has at least two disadvantages:
> > 1) If the default in a RA ever changes, the change would have no
> > effect 
> > ‑ a value in the CIB would still be set to the previous default.
> > To 
> > configure it to follow the defaults, one would have to remove the
> > option 
> > value afterwards or a new option to pcs commands to control the
> > behavior 
> > would have to be added.
> 
> Agreed.
> 
> > 2) When a value is the same as its default it would be unclear if
> > the 
> > intention is to follow the default or the user set a value which is
> > the 
> > same as the default by coincidence.
> 
> Agreed.
> 
> Are there any plans to decorate the DTD or RNG with comments some
> day? I think
> that would be the perfect place to describe the meanings.

The standard has its own repo:

https://github.com/ClusterLabs/OCF-spec

The ra/next directory is where we're putting proposed changes (ra-
api.rng is the RNG). Once accepted for the upcoming 1.1 standard, the
changes are copied to the ra/1.1 directory, and at some point, 1.1 will
be officially adopted as the current standard.

So, pull requests are welcome :)

I have an outstanding PR that unfortunately I had to put on the back
burner but should be the last big set of changes for 1.1:

https://github.com/ClusterLabs/OCF-spec/pull/21/files

> 
> Regards,
> Ulrich
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] Antw: Re: Q: Recommended directory for RA auxiliary files?

2019-09-05 Thread Ken Gaillot
On Thu, 2019-09-05 at 07:57 +0200, Ulrich Windl wrote:
> > > > Ken Gaillot  schrieb am 04.09.2019 um
> > > > 16:26 in
> 
> Nachricht
> <2634f19382b90736bdfb80b9c84997111479d337.ca...@redhat.com>:
> > On Wed, 2019‑09‑04 at 10:07 +0200, Jehan‑Guillaume de Rorthais
> > wrote:
> > > On Tue, 03 Sep 2019 09:35:39 ‑0500
> > > Ken Gaillot  wrote:
> > > 
> > > > On Mon, 2019‑09‑02 at 15:23 +0200, Ulrich Windl wrote:
> > > > > Hi!
> > > > > 
> > > > > Are there any recommendations where to place (fixed content)
> > > > > files an
> > > > > RA uses?
> > > > > Usually my RAs use a separate XML file for the metadata, just
> > > > > to
> > > > > allow editing it in XML mode automatically.
> > > > > Traditionally I put the file in the same directory as the RA
> > > > > itself
> > > > > (like "cat $0.xml" for meta‑data).
> > > > > Are there any expectations that every file in the RA
> > > > > directory is
> > > > > an
> > > > > RA?
> > > > > (Currently I'm extending an RA, and I'd like to provide some
> > > > > additional user‑modifiable template file, and I wonder which
> > > > > path
> > > > > to
> > > > > use)
> > > > > 
> > > > > Regards,
> > > > > Ulrich  
> > > > 
> > > > I believe most (maybe even all modern?) deployments have both
> > > > lib
> > > > and
> > > > resource.d under /usr/lib/ocf. If you have a custom provider
> > > > for
> > > > the RA
> > > > under resource.d, it would make sense to use the same pattern
> > > > under
> > > > lib.
> > > 
> > > Shouldn't it be $OCF_FUNCTIONS_DIR?
> > 
> > Good point ‑‑ if the RA is using ocf‑shellfuncs, yes. $OCF_ROOT/lib
> > should be safe if the RA doesn't use ocf‑shellfuncs.
> > 
> > It's a weird situation; the OCF standard actually specifies
> > /usr/ocf,
> > but everyone implemented /usr/lib/ocf. I do plan to add a configure
> > option for it in pacemaker, but it shouldn't be changed unless you
> > can
> > make the same change in every other cluster component that needs
> > it.
> 
> The thing with $OCF_ROOT is: If $OCF_ROOT already contains "/lib", it
> looks
> off to add another "/lib".

It does look weird, but that's the convention in use today.

I hope we eventually get to the point where the .../lib and
.../resource.d locations are configure-time options, and distros can
choose whatever's consistent with their usual policies. For those that
follow the FHS, it might be something like /usr/lib/ocf or
/usr/share/ocf, and /usr/libexec/ocf.

However all cluster components installed on a host must be configured
the same locations, so that will require careful coordination. It's
easier to just keep using the current ones :)

> To me it looks as if it's time for an $OCF_LIB (which would be
> $OCF_ROOT if
> the latter is /usr/lib/ocf already, otherwise $OCF_ROOT/lib).
> Personally I
> think the /usr/ predates the
> [/usr][/share]]/lib/.
> 
> > 
> > > Could this be generalized to RA for their
> > > own lib or permanent dependencies files?
> > 
> > The OCF standard specifies only the resource.d subdirectory, and
> > doesn't comment on adding others. lib/heartbeat is a common choice
> > for
> > the resource‑agents package shell includes (an older approach was
> > to
> > put them as dot files in resource.d/heartbeat, and there are often
> > symlinks at those locations for backward compatibility).
> > 
> > Since "heartbeat" is a resource agent provider name, and the
> > standard
> > specifies that agents go under resource.d/<provider>, it does make
> > sense that lib/<provider> would be where RA files would go.
> 
> I wonder when we will be able to retire "heartbeat" ;-) If it's
> supposed to be
> of "vendor" type, maybe replace it with "clusterlabs" at some time...

Definitely, that's been the plan for a while, it's just another change
that will require coordination across multiple components.

The hope is that we can at some point wrap up the OCF 1.1 standard, and
then move forward some of the bigger changes. It's just hard to
prioritize that kind of work when there's a backlog of important stuff.

> 
> Regards,
> Ulrich
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] Antw: Re: Q: Recommended directory for RA auxiliary files?

2019-09-04 Thread Ken Gaillot
On Wed, 2019-09-04 at 10:09 +0200, Jehan-Guillaume de Rorthais wrote:
> On Wed, 04 Sep 2019 07:54:50 +0200
> "Ulrich Windl"  wrote:
> 
> > > > > Ken Gaillot  schrieb am 03.09.2019 um
> > > > > 16:35 in  
> > 
> > Nachricht
> > <979978d5a488aabd9ed4a941ff4eac60c271c84d.ca...@redhat.com>:
> > > On Mon, 2019‑09‑02 at 15:23 +0200, Ulrich Windl wrote:  
> > > > Hi!
> > > > 
> > > > Are there any recommendations where to place (fixed content)
> > > > files an
> > > > RA uses?
> > > > Usually my RAs use a separate XML file for the metadata, just
> > > > to
> > > > allow editing it in XML mode automatically.
> > > > Traditionally I put the file in the same directory as the RA
> > > > itself
> > > > (like "cat $0.xml" for meta‑data).
> > > > Are there any expectations that every file in the RA directory
> > > > is an
> > > > RA?
> > > > (Currently I'm extending an RA, and I'd like to provide some
> > > > additional user‑modifiable template file, and I wonder which
> > > > path to
> > > > use)
> > > > 
> > > > Regards,
> > > > Ulrich  
> > > 
> > > I believe most (maybe even all modern?) deployments have both lib
> > > and
> > > resource.d under /usr/lib/ocf. If you have a custom provider for
> > > the RA
> > > under resource.d, it would make sense to use the same pattern
> > > under
> > > lib.  
> > 
> > So what concrete path are you suggesting?
> > /usr/lib//?
> 
> I would bet on /usr/lib/ocf/lib/<provider>/ ?

That was what I had in mind. Parallels "heartbeat"
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] Q: Recommended directory for RA auxiliary files?

2019-09-04 Thread Ken Gaillot
On Wed, 2019-09-04 at 10:07 +0200, Jehan-Guillaume de Rorthais wrote:
> On Tue, 03 Sep 2019 09:35:39 -0500
> Ken Gaillot  wrote:
> 
> > On Mon, 2019-09-02 at 15:23 +0200, Ulrich Windl wrote:
> > > Hi!
> > > 
> > > Are there any recommendations where to place (fixed content)
> > > files an
> > > RA uses?
> > > Usually my RAs use a separate XML file for the metadata, just to
> > > allow editing it in XML mode automatically.
> > > Traditionally I put the file in the same directory as the RA
> > > itself
> > > (like "cat $0.xml" for meta-data).
> > > Are there any expectations that every file in the RA directory is
> > > an
> > > RA?
> > > (Currently I'm extending an RA, and I'd like to provide some
> > > additional user-modifiable template file, and I wonder which path
> > > to
> > > use)
> > > 
> > > Regards,
> > > Ulrich  
> > 
> > I believe most (maybe even all modern?) deployments have both lib
> > and
> > resource.d under /usr/lib/ocf. If you have a custom provider for
> > the RA
> > under resource.d, it would make sense to use the same pattern under
> > lib.
> 
> Shouldn't it be $OCF_FUNCTIONS_DIR?

Good point -- if the RA is using ocf-shellfuncs, yes. $OCF_ROOT/lib
should be safe if the RA doesn't use ocf-shellfuncs.

It's a weird situation; the OCF standard actually specifies /usr/ocf,
but everyone implemented /usr/lib/ocf. I do plan to add a configure
option for it in pacemaker, but it shouldn't be changed unless you can
make the same change in every other cluster component that needs it.

> Could this be generalized to RA for their
> own lib or permanent dependencies files?

The OCF standard specifies only the resource.d subdirectory, and
doesn't comment on adding others. lib/heartbeat is a common choice for
the resource-agents package shell includes (an older approach was to
put them as dot files in resource.d/heartbeat, and there are often
symlinks at those locations for backward compatibility).

Since "heartbeat" is a resource agent provider name, and the standard
specifies that agents go under resource.d/<provider>, it does make
sense that lib/<provider> would be where RA files would go.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] stonith-ng - performing action 'monitor' timed out with signal 15

2019-09-16 Thread Ken Gaillot
On Tue, 2019-09-03 at 10:09 +0200, Marco Marino wrote:
> Hi, I have a problem with fencing on a two node cluster. It seems
> that randomly the cluster cannot complete monitor operation for fence
> devices. In log I see:
> crmd[8206]:   error: Result of monitor operation for fence-node2 on
> ld2.mydomain.it: Timed Out
> As attachment there is 
> - /var/log/messages for node1 (only the important part)
> - /var/log/messages for node2 (only the important part) <-- Problem
> starts here
> - pcs status
> - pcs stonith show (for both fence devices)
> 
> I think it could be a timeout problem, so how can I see timeout value
> for monitor operation in stonith devices?
> Please, someone can help me with this problem?
> Furthermore, how can I fix the state of fence devices without
> downtime?
> 
> Thank you

How to investigate depends on whether this is an occasional monitor
failure, or happens every time the device start is attempted. From the
status you attached, I'm guessing it's at start.

In that case, my next step (since you've already verified ipmitool
works directly) would be to run the fence agent manually using the same
arguments used in the cluster configuration.

Check the man page for the fence agent, looking at the section for
"Stdin Parameters". These are what's used in the cluster configuration,
so make a note of what values you've configured. Then run the fence
agent like this:

echo -e "action=status\nPARAMETER=VALUE\nPARAMETER=VALUE\n..." | /path/to/agent

where PARAMETER=VALUE entries are what you have configured in the
cluster. If the problem isn't obvious from that, you can try adding a
debug_file parameter.
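
For example, for an IPMI-based device it might look something like this
-- the parameter names and values here are only placeholders, so take
the real ones from "pcs stonith show fence-node2" and the agent's man
page:

  echo -e "action=status\nip=192.168.1.12\nusername=admin\npassword=secret\nlanplus=1" \
      | /usr/sbin/fence_ipmilan
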
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Why is last-lrm-refresh part of the CIB config?

2019-09-09 Thread Ken Gaillot
On Mon, 2019-09-09 at 11:06 +0200, Ulrich Windl wrote:
> Hi!
> 
> In recent pacemaker I see that last-lrm-refresh is included in the
> CIB config (crm_config/cluster_property_set), so CIBs are "different"
> when they are actually the same.
> 
> Example diff:
> -   <nvpair ... name="last-lrm-refresh" value="1566194010"/>
> +   <nvpair ... name="last-lrm-refresh" value="1567945827"/>
> 
> I don't see a reason for having that. Can someone explain?
> 
> Regards,
> Ulrich

New transitions (re-calculation of cluster status) are triggered by
changes in the CIB. last-lrm-refresh isn't really special in any way,
it's just a value that can be changed arbitrarily to trigger a new
transition when nothing "real" is changing.

I'm not sure what would actually be setting it these days; its use has
almost vanished in recent code. I think it was used more commonly for
clean-ups in the past.
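
If the stray property bothers you (e.g. when diffing CIBs), it should
be safe to remove like any other cluster property -- whatever still
sets it would simply recreate it:

  crm_attribute --type crm_config --name last-lrm-refresh --delete
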
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Corosync main process was not scheduled for 2889.8477 ms (threshold is 800.0000 ms), though it runs with realtime priority and there was not much load on the node

2019-09-09 Thread Ken Gaillot
On Mon, 2019-09-09 at 14:21 +0200, wf...@niif.hu wrote:
> Andrei Borzenkov  writes:
> 
> > 04.09.2019 0:27, wf...@niif.hu writes:
> > 
> > > Jeevan Patnaik  writes:
> > > 
> > > > [16187] node1 corosyncwarning [MAIN  ] Corosync main process
> > > > was not
> > > > scheduled for 2889.8477 ms (threshold is 800.0000 ms). Consider
> > > > token
> > > > timeout increase.
> > > > [...]
> > > > 2. How to fix this? We have not much load on the nodes, the
> > > > corosync is
> > > > already running with RT priority.
> > > 
> > > Does your corosync daemon use a watchdog device?  (See in the
> > > startup
> > > logs.)  Watchdog interaction can be *slow*.
> > 
> > Can you elaborate? This is the first time I see that corosync has
> > anything to do with watchdog. How exactly corosync interacts with
> > watchdog? Where in corosync configuration watchdog device is
> > defined?
> 
> Inside the resources directive you can specify a watchdog_device, 

Side comment: corosync's built-in watchdog handling is an older
alternative to sbd, the watchdog manager that pacemaker uses. You'd use
one or the other.

If you're running pacemaker on top of corosync, you'd probably want sbd
since pacemaker can use it for more situations than just cluster
membership loss.

> which
> Corosync will "pet" from its main loop.  From corosync.conf(5):
> 
> > In a cluster with properly configured power fencing a watchdog
> > provides no additional value.  On the other hand, slow watchdog
> > communication may incur multi-second delays in the Corosync main
> > loop,
> > potentially breaking down membership.  IPMI watchdogs are
> > particularly
> > notorious in this regard: read about kipmid_max_busy_us in IPMI.txt
> > in
> > the Linux kernel documentation.
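
For reference, the knob being discussed is roughly this in
corosync.conf (check corosync.conf(5) for your version):

  resources {
      # corosync's own watchdog handling -- don't combine this with sbd,
      # which expects to manage the watchdog device itself
      watchdog_device: /dev/watchdog
  }
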
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] op stop timeout update causes monitor op to fail?

2019-09-11 Thread Ken Gaillot
On Tue, 2019-09-10 at 09:54 +0200, Dennis Jacobfeuerborn wrote:
> Hi,
> I just updated the timeout for the stop operation on an nfs cluster
> and
> while the timeout was update the status suddenly showed this:
> 
> Failed Actions:
> * nfsserver_monitor_1 on nfs1aqs1 'unknown error' (1): call=41,
> status=Timed Out, exitreason='none',
> last-rc-change='Tue Aug 13 14:14:28 2019', queued=0ms, exec=0ms

Are you sure it wasn't already showing that? The timestamp of that
error is Aug 13, while the logs show the timeout update happening Sep
10.

Old errors will keep showing up in status until you manually clean them
up (with crm_resource --cleanup or a higher-level tool equivalent), or
any configured failure-timeout is reached.

In any case, the log excerpt shows that nothing went wrong during the
time it covers. There were no actions scheduled in that transition in
response to the timeout change (which is as expected).
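
To get rid of it, either clean it up by hand or give the resource a
failure-timeout so such entries age out on their own, e.g.:

  pcs resource cleanup nfsserver
  pcs resource update nfsserver meta failure-timeout=1h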

> 
> The command used:
> pcs resource update nfsserver op stop timeout=30s
> 
> I can't imagine that this is expected to happen. Is there another way
> to
> update the timeout that doesn't cause this?
> 
> I attached the log of the transition.
> 
> Regards,
>   Dennis
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Strange behaviour of group resource

2019-07-30 Thread Ken Gaillot
On Tue, 2019-07-30 at 16:26 +0530, Dileep V Nair wrote:
> Thanks Ken for the response. I see below errors. Not sure why it says
> target: 7 vs. rc: 0. Does that mean that pacemaker expect the
> resource to be stopped and since it is running, it is taking an
> action ?
> 
> Jul 30 10:08:59 dntstdb2s0703 cib[90848]: warning: A-Sync reply to
> crmd failed: No message of desired type
> 
> Jul 30 10:09:04 dntstdb2s0703 crmd[90853]: warning: Action 16 (fs-
> sapdata4_monitor_0) on dntstdb2s0703 failed (target: 7 vs. rc: 0):
> Error
> Jul 30 10:09:04 dntstdb2s0703 crmd[90853]: notice: Transition 1445
> aborted by operation fs-sapdata4_monitor_0 'modify' on dntstdb2s0703:
> Event failed

These actually aren't errors, and they're expected after a clean-up. I
recently merged a change to make the message more accurate. As of the
next release, it will look like:

notice: Transition 1445 action 5 (fs-sapdata4_monitor_0 on dntstdb2s0703): 
expected 'not running' but got 'ok'

Cleaning up a resource involves clearing its history. That makes the
cluster expect that it is stopped. The cluster then runs probes to find
out the actual status, and if the probe finds it running, the above
situation happens.

So, that's not causing the restarts. An actual failure that could cause
restarts would have a similar message, but the rc would be something
other than 0 or 7.

> Jul 30 10:09:04 dntstdb2s0703 crmd[90853]: warning: Action 16 (fs-
> sapdata4_monitor_0) on dntstdb2s0703 failed (target: 7 vs. rc: 0):
> Error
> Jul 30 10:09:04 dntstdb2s0703 stonith-ng[90849]: notice: On loss of
> CCM Quorum: Ignore
> Jul 30 10:09:04 dntstdb2s0703 crmd[90853]: notice: Result of probe
> operation for fs-saptmp3 on dntstdb2s0703: 0 (ok)
> Jul 30 10:09:04 dntstdb2s0703 crmd[90853]: warning: Action 19 (fs-
> saptmp3_monitor_0) on dntstdb2s0703 failed (target: 7 vs. rc: 0):
> Error
> Jul 30 10:09:04 dntstdb2s0703 crmd[90853]: warning: Action 19 (fs-
> saptmp3_monitor_0) on dntstdb2s0703 failed (target: 7 vs. rc: 0):
> Error
> 
> Thanks & Regards
> 
> Dileep Nair
> Squad Lead - SAP Base 
> Togaf Certified Enterprise Architect
> IBM Services for Managed Applications
> +91 98450 22258 Mobile
> dilen...@in.ibm.com
> 
> IBM Services
> 
> 
> Ken Gaillot ---07/30/2019 12:47:52 AM---On Thu, 2019-07-25 at 20:51
> +0530, Dileep V Nair wrote: > Hi,
> 
> From: Ken Gaillot 
> To: Cluster Labs - All topics related to open-source clustering
> welcomed 
> Date: 07/30/2019 12:47 AM
> Subject: [EXTERNAL] Re: [ClusterLabs] Strange behaviour of group
> resource
> Sent by: "Users" 
> 
> 
> 
> On Thu, 2019-07-25 at 20:51 +0530, Dileep V Nair wrote:
> > Hi,
> > 
> > I have around 10 filesystems in a group. When I do a crm resource
> > refresh, the filesystems are unmounted and remounted, starting from
> > the fourth resource in the group. Any idea what could be going on,
> is
> > it expected ?
> 
> No, it sounds like some of the reprobes are failing. The logs may
> have
> more info. Each filesystem will have a probe like RSCNAME_monitor_0
> on
> each node.
> 
> > 
> > Thanks & Regards
> > 
> > Dileep Nair
> > Squad Lead - SAP Base 
> > Togaf Certified Enterprise Architect
> > IBM Services for Managed Applications
> > +91 98450 22258 Mobile
> > dilen...@in.ibm.com
> > 
> > IBM Services
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Adding HAProxy as a Resource

2019-07-30 Thread Ken Gaillot
On Tue, 2019-07-30 at 09:39 +, Somanath Jeeva wrote:
> Hi ,
> 
> I am compiling and installing the server on EL 7 server. Also I tried
> once again with systemd-devel and dbus-devel installed , but still I
> am unable to use system resources.
> 
> Is there any other dependency that I have to install.
> 
> 
> 
> With Regards
> Somanath Thilak J

Check your config.log for messages starting at "systemd version query".

On EL7 though, I'd recommend using pre-built packages, whether the
supported ones or from EPEL. If you really want to compile your own,
I'd go with the latest 1.1 or 2.0 version (currently 1.1.21 or 2.0.2).
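
For example:

  grep -A3 "systemd version query" config.log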

> 
> -Original Message-
> From: Ken Gaillot  
> Sent: Tuesday, July 30, 2019 12:53 AM
> To: Somanath Jeeva ; Cluster Labs - All
> topics related to open-source clustering welcomed <
> users@clusterlabs.org>; Tomas Jelinek 
> Subject: Re: [ClusterLabs] Adding HAProxy as a Resource
> 
> On Fri, 2019-07-26 at 04:46 +, Somanath Jeeva wrote:
> > Hi Ken,
> > 
> > I am using the below versions
> > 
> > Pacemaker - 1.1.16
> > Corosync - 2.4.3
> > PCS - 0.9
> > Resource agents - 3.9.6
> > 
> > During running configure I didn’t give any options. I just ran 
> > configure.sh and then did a make install.
> 
> In that case, it will check whether the build machine has everything
> needed to support systemd. The main requirement is that the systemd
> and dbus development libraries (systemd-devel and dbus-devel) are
> available.
> 
> > 
> > With Regards
> > Somanath Thilak J
> > 
> > -Original Message-
> > From: Ken Gaillot 
> > Sent: Thursday, July 25, 2019 7:22 PM
> > To: Cluster Labs - All topics related to open-source clustering 
> > welcomed ; Tomas Jelinek <
> > tojel...@redhat.com>
> > Subject: Re: [ClusterLabs] Adding HAProxy as a Resource
> > 
> > On Thu, 2019-07-25 at 07:23 +, Somanath Jeeva wrote:
> > > Hi
> > > 
> > > Systemd unit file is available for haproxy but the pcs resource 
> > > standard command does not list systemd standard .
> > > 
> > > Also I am not using the pacemaker packages from redhat. I am
> > > using 
> > > the packages downloaded from clusterlabs.
> > 
> > Hi Somanath,
> > 
> > Which version of pacemaker are you using?
> > 
> > If you built it from source, did you give any options to the
> > configure 
> > command?
> > 
> > > 
> > > 
> > > 
> > > With Regards
> > > Somanath Thilak J
> > > 
> > > -Original Message-
> > > From: Tomas Jelinek 
> > > Sent: Monday, July 15, 2019 5:58 PM
> > > To: users@clusterlabs.org
> > > Subject: Re: [ClusterLabs] Adding HAProxy as a Resource
> > > 
> > > Hi,
> > > 
> > > Do you have a systemd unit file for haproxy installed?
> > > Does 'crm_resource --list-standards' print 'systemd'?
> > > Does 'crm_resource --list-agents systemd' print 'haproxy'?
> > > Note that when you use full agent name (that is including : ) it
> > > is 
> > > case sensitive in pcs.
> > > 
> > > Regards,
> > > Tomas
> > > 
> > > 
> > > Dne 11. 07. 19 v 10:14 Somanath Jeeva napsal(a):
> > > > Hi
> > > > 
> > > > I am using the resource agents built from clusterlabs and when
> > > > I 
> > > > add the systemd resource I am getting the below error .
> > > > 
> > > > $ sudo pcs resource create HAPROXY systemd:haproxy op monitor 
> > > > interval=2s
> > > > Error: Agent 'systemd:haproxy' is not installed or does not 
> > > > provide valid metadata: Metadata query for systemd:haproxy
> > > > failed: 
> > > > -22, use --force to override
> > > > 
> > > > 
> > > > 
> > > > With Regards
> > > > Somanath Thilak J
> > > > 
> > > > -Original Message-
> > > > From: Kristoffer Grönlund 
> > > > Sent: Thursday, July 11, 2019 1:22 PM
> > > > To: Cluster Labs - All topics related to open-source
> > > > clustering 
> > > > welcomed 
> > > > Cc: Somanath Jeeva 
> > > > Subject: Re: [ClusterLabs] Adding HAProxy as a Resource
> > > > 
> > > > On 2019-07-11 09:31, Somanath Jeeva wrote:
> > > > > Hi All,
> > > > > 
> > > > > I am using HAProxy in my environment  which I 

Re: [ClusterLabs] "resource cleanup" - but error message does not dissapear

2019-07-30 Thread Ken Gaillot
On Tue, 2019-07-30 at 19:18 +0200, Lentes, Bernd wrote:
> Hi,
> 
> i always have on one of my cluster nodes "crm_mon -nfrALm 3" running
> in a ssh session,
> which gives a good and short overview of the status of the cluster.
> I just had some problems in live migrating some VirtualDomains.
> These are the errors i see:
> Failed Resource Actions:
> * vm_genetrap_migrate_to_0 on ha-idg-1 'unknown error' (1): call=324,
> status=complete, exitreason='genetrap: live migration to ha-idg-2-
> private failed: 1',
> last-rc-change='Tue Jul 30 18:38:51 2019', queued=0ms,
> exec=42895ms
> * vm_idcc_devel_migrate_to_0 on ha-idg-1 'unknown error' (1):
> call=321, status=complete, exitreason='idcc_devel: live migration to
> ha-idg-2-private failed: 1',
> last-rc-change='Tue Jul 30 18:38:51 2019', queued=0ms,
> exec=35885ms
> * vm_mausdb_migrate_to_0 on ha-idg-1 'unknown error' (1): call=312,
> status=complete, exitreason='mausdb: live migration to ha-idg-2-
> private failed: 1',
> last-rc-change='Tue Jul 30 18:38:51 2019', queued=0ms,
> exec=37254ms
> * vm_geneious_migrate_to_0 on ha-idg-1 'unknown error' (1): call=318,
> status=complete, exitreason='geneious: live migration to ha-idg-2-
> private failed: 1',
> last-rc-change='Tue Jul 30 18:38:51 2019', queued=1ms,
> exec=36175ms
> * vm_severin_migrate_to_0 on ha-idg-1 'unknown error' (1): call=333,
> status=complete, exitreason='severin: live migration to ha-idg-2-
> private failed: 1',
> last-rc-change='Tue Jul 30 18:38:51 2019', queued=1ms,
> exec=36265ms
> * vm_sim_migrate_to_0 on ha-idg-1 'unknown error' (1): call=315,
> status=complete, exitreason='sim: live migration to ha-idg-2-private
> failed: 1',
> last-rc-change='Tue Jul 30 18:38:51 2019', queued=1ms,
> exec=41875ms
> 
> What i'm used to do is to invoke a "resource cleanup" to get rid of
> these messages.
> But now it just worked for two of the messages, the errors for the
> other six remained !?!
> 
> Any idea ?
> 
> 
> Bernd

There was a regression in 1.1.20 and 2.0.0 (fixed in the next versions)
where cleanups of multiple errors would miss some of them. Any chance
you're using one of those?
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: how to connect to the cluster from a docker container

2019-08-07 Thread Ken Gaillot
On Wed, 2019-08-07 at 14:42 +0200, Dejan Muhamedagic wrote:
> Hi,
> 
> On Wed, Aug 07, 2019 at 11:23:09AM +0200, Klaus Wenninger wrote:
> > On 8/7/19 10:09 AM, Dejan Muhamedagic wrote:
> > > Hi Ulrich,
> > > 
> > > On Tue, Aug 06, 2019 at 02:38:10PM +0200, Ulrich Windl wrote:
> > > > > > > Dejan Muhamedagic  schrieb am
> > > > > > > 06.08.2019 um 10:37 in
> > > > 
> > > > Nachricht <20190806083726.GA8262@capote>:
> > > > > Hi,
> > > > > 
> > > > > Hawk runs in a docker container on one of the cluster nodes
> > > > > (the
> > > > > nodes run Debian and apparently it's rather difficult to
> > > > > install
> > > > > hawk on a non‑SUSE distribution, hence docker). Now, how to
> > > > > connect to the cluster? Hawk uses the pacemaker command line
> > > > > tools such as cibadmin. I have a vague recollection that
> > > > > there is
> > > > > a way to connect over tcp/ip, but, if that is so, I cannot
> > > > > find
> > > > > any documentation about it.
> > > > 
> > > > I always thought hawk has to run on one of the cluster nodes
> > > > (natively).
> > > 
> > > Well, let's see if that is the case. BTW, the Dockerfile is
> > > available here:
> > > 
> > > https://github.com/krig/docker-hawk
> > > 
> > > Cheers,
> > > 
> > > Dejan
> > 
> > That container seems to be foreseen to act as a cluster-node
> > controlling docker-containers on the same host.
> > If the pacemaker-version inside the container is close enough
> > to the pacemaker-version you are running on debian and
> > if it has pacemaker-remote you might be able to run the
> > container as guest-node.
> > No idea though if tooling hawk uses is gonna be happy tunneling
> > through pacemaker-remote.
> 
> hawk seems to be using only the standard pacemaker-cli-utils
> (cibadmin etc).
> 
> > A little bit like hypervisors are doing it nowadays - running the
> > admin-interface in a VM ...
> > Of course just useful if you can live with hawk not being
> > available if the cluster is in a state where it doesn't start
> > the guest-node.
> 
> Interesting idea. Would then cibadmin et al work from this remote
> node?

Yes, that sounds like a really good option. There may still be a few
command-line options here and there that aren't remote friendly, but
that should be rare with recent versions.

I'd launch the container via a bundle, for simplicity.

I don't know everything hawk can do, but obviously you couldn't start
the cluster via hawk using that setup! It may also open up some new
avenues for trouble, e.g. the user may be able to disable the hawk
resource via hawk, but couldn't enable it again without resorting to
the command line.

Whatever approach you go with, in your case it's important to keep the
pacemaker version inside the container the same or newer than the rest
of the cluster. That's because it will need the schema files to
validate the cluster configuration. (This isn't important for most
containers, since they don't run any configuration commands.)
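
As a starting point, a bundle sketch might look like this -- the image
name and replica count are placeholders, 7630 is hawk's usual port, and
it assumes the image contains pacemaker-remote as Klaus mentioned:

  <bundle id="hawk-bundle">
    <docker image="docker-hawk" replicas="1"/>
    <network control-port="3121">
      <port-mapping id="hawk-port" port="7630"/>
    </network>
  </bundle>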

> 
> Cheers,
> 
> Dejan
> 
> > Klaus
> > > 
> > > > > Cheers,
> > > > > 
> > > > > Dejan

-- Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] Reusing resource set in multiple constraints

2019-07-29 Thread Ken Gaillot
On Sat, 2019-07-27 at 11:04 +0300, Andrei Borzenkov wrote:
> Is it possible to have single definition of resource set that is
> later
> references in order and location constraints? All syntax in
> documentation or crmsh presumes inline set definition in location or
> order statement.
> 
> In this particular case there will be set of filesystems that need to
> be
> colocated and ordered against other resources; these filesystems will
> be
> extended over time and I would like to add new definition in just one
> place.
> 

Tags can do what you want:
https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Explained/index.html#_tagging_configuration_elements
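
Roughly, in the CIB (resource ids made up), the tag can then be
referenced in colocation/order constraints like a single resource, and
adding a new filesystem is just one more obj_ref:

  <tags>
    <tag id="all-fs">
      <obj_ref id="fs-data1"/>
      <obj_ref id="fs-data2"/>
    </tag>
  </tags>
  ...
  <rsc_colocation id="fs-with-ip" rsc="all-fs" with-rsc="my-ip"
                  score="INFINITY"/>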

-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Strange behaviour of group resource

2019-07-29 Thread Ken Gaillot
On Thu, 2019-07-25 at 20:51 +0530, Dileep V Nair wrote:
> Hi,
> 
> I have around 10 filesystems in a group. When I do a crm resource
> refresh, the filesystems are unmounted and remounted, starting from
> the fourth resource in the group. Any idea what could be going on, is
> it expected ?

No, it sounds like some of the reprobes are failing. The logs may have
more info. Each filesystem will have a probe like RSCNAME_monitor_0 on
each node.

> 
> Thanks & Regards
> 
> Dileep Nair
> Squad Lead - SAP Base 
> Togaf Certified Enterprise Architect
> IBM Services for Managed Applications
> +91 98450 22258 Mobile
> dilen...@in.ibm.com
> 
> IBM Services
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Adding HAProxy as a Resource

2019-07-29 Thread Ken Gaillot
On Fri, 2019-07-26 at 04:46 +, Somanath Jeeva wrote:
> Hi Ken,
> 
> I am using the below versions
> 
> Pacemaker - 1.1.16
> Corosync - 2.4.3
> PCS - 0.9
> Resource agents - 3.9.6
> 
> During running configure I didn’t give any options. I just ran
> configure.sh and then did a make install.

In that case, it will check whether the build machine has everything
needed to support systemd. The main requirement is that the systemd and
dbus development libraries (systemd-devel and dbus-devel) are
available.
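
So on the EL7 build host, something along these lines before rebuilding
should be enough (package names as on EL7):

  yum install -y systemd-devel dbus-devel
  ./configure
  grep -i systemd config.log   # confirm the systemd checks passed
  make && make install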

> 
> With Regards
> Somanath Thilak J
> 
> -Original Message-
> From: Ken Gaillot  
> Sent: Thursday, July 25, 2019 7:22 PM
> To: Cluster Labs - All topics related to open-source clustering
> welcomed ; Tomas Jelinek 
> Subject: Re: [ClusterLabs] Adding HAProxy as a Resource
> 
> On Thu, 2019-07-25 at 07:23 +, Somanath Jeeva wrote:
> > Hi
> > 
> > Systemd unit file is available for haproxy but the pcs resource 
> > standard command does not list systemd standard .
> > 
> > Also I am not using the pacemaker packages from redhat. I am using
> > the 
> > packages downloaded from clusterlabs.
> 
> Hi Somanath,
> 
> Which version of pacemaker are you using?
> 
> If you built it from source, did you give any options to the
> configure command?
> 
> > 
> > 
> > 
> > With Regards
> > Somanath Thilak J
> > 
> > -Original Message-
> > From: Tomas Jelinek 
> > Sent: Monday, July 15, 2019 5:58 PM
> > To: users@clusterlabs.org
> > Subject: Re: [ClusterLabs] Adding HAProxy as a Resource
> > 
> > Hi,
> > 
> > Do you have a systemd unit file for haproxy installed?
> > Does 'crm_resource --list-standards' print 'systemd'?
> > Does 'crm_resource --list-agents systemd' print 'haproxy'?
> > Note that when you use full agent name (that is including : ) it
> > is 
> > case sensitive in pcs.
> > 
> > Regards,
> > Tomas
> > 
> > 
> > Dne 11. 07. 19 v 10:14 Somanath Jeeva napsal(a):
> > > Hi
> > > 
> > > I am using the resource agents built from clusterlabs and when I
> > > add 
> > > the systemd resource I am getting the below error .
> > > 
> > > $ sudo pcs resource create HAPROXY systemd:haproxy op monitor 
> > > interval=2s
> > > Error: Agent 'systemd:haproxy' is not installed or does not
> > > provide 
> > > valid metadata: Metadata query for systemd:haproxy failed: -22,
> > > use 
> > > --force to override
> > > 
> > > 
> > > 
> > > With Regards
> > > Somanath Thilak J
> > > 
> > > -Original Message-
> > > From: Kristoffer Grönlund 
> > > Sent: Thursday, July 11, 2019 1:22 PM
> > > To: Cluster Labs - All topics related to open-source clustering 
> > > welcomed 
> > > Cc: Somanath Jeeva 
> > > Subject: Re: [ClusterLabs] Adding HAProxy as a Resource
> > > 
> > > On 2019-07-11 09:31, Somanath Jeeva wrote:
> > > > Hi All,
> > > > 
> > > > I am using HAProxy in my environment  which I plan to add to 
> > > > pacemaker as resource. I see no RA available for that in
> > > > resource 
> > > > agent.
> > > > 
> > > > Should I write a new RA or is there any way to add it to
> > > > pacemaker 
> > > > as a systemd service.
> > > 
> > > Hello,
> > > 
> > > haproxy works well as a plain systemd service, so you can add it
> > > as 
> > > systemd:haproxy - that is, instead of an ocf: prefix, just put 
> > > systemd:.
> > > 
> > > If you want the cluster to manage multiple, differently
> > > configured 
> > > instances of haproxy, you might have to either create custom
> > > systemd 
> > > service scripts for each one, or create an agent with parameters.
> > > 
> > > Cheers,
> > > Kristoffer
> > > 
> > > > 
> > > > 
> > > > 
> > > > With Regards
> > > > Somanath Thilak J
> > > > 
> > > > 
> > > > _______________
> > > > Manage your subscription:
> > > > 
> 
> https://protect2.fireeye.com/url?k=28466b53-74926310-28462bc8-86a1150
> > > > b
> > > > c3ba-
> > > > bb674f3a9b557cbd=1=https%3A%2F%2Flists.clusterlabs.org%2Fma
> > > > i
> > > > l
> > > > man%2Flistinfo%2Fusers
> > > > 
> > > > ClusterLabs home:
> > > > 
> 
> https://protect2.fireeye.com/url?k=4c5edd73-108ad530-4c5e9de8-86a1150
> > > > b c3ba-5da4e39ebe912cdf=1=https%3A%2F%2F
> > > > www.clusterlabs.org%2F
> > > 
> > > ___
> > > Manage your subscription:
> > > https://lists.clusterlabs.org/mailman/listinfo/users
> > > 
> > > ClusterLabs home: https://www.clusterlabs.org/
> > > 
> > 
> > ___
> > Manage your subscription:
> > https://lists.clusterlabs.org/mailman/listinfo/users
> > 
> > ClusterLabs home: https://www.clusterlabs.org/
> 
> --
> Ken Gaillot 
> 
> 
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] how to connect to the cluster from a docker container

2019-08-06 Thread Ken Gaillot
On Tue, 2019-08-06 at 14:03 +0200, Jan Pokorný wrote:
> On 06/08/19 13:36 +0200, Jan Pokorný wrote:
> > On 06/08/19 10:37 +0200, Dejan Muhamedagic wrote:
> > > Hawk runs in a docker container on one of the cluster nodes (the
> > > nodes run Debian and apparently it's rather difficult to install
> > > hawk on a non-SUSE distribution, hence docker). Now, how to
> > > connect to the cluster? Hawk uses the pacemaker command line
> > > tools such as cibadmin. I have a vague recollection that there is
> > > a way to connect over tcp/ip, but, if that is so, I cannot find
> > > any documentation about it.

I think one of the solutions Jan suggested would be best, but what
you're likely remembering is remote-tls-port:

https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Administration/#s-remote-connection

However that only works for the CIB, so anything that needed to contact
other daemons wouldn't work.
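
If you want to experiment with it anyway, the rough idea (untested
sketch here; the host name, port and password are placeholders) is to
set remote-tls-port on the cib tag and then point the command-line
tools at it from the container via the CIB_* environment variables:

    # on a cluster node: accept remote CIB connections on TCP port 9898
    cibadmin --modify --xml-text '<cib remote-tls-port="9898"/>'

    # from the container, as a user the cluster node knows about
    export CIB_server=cluster-node CIB_port=9898 CIB_encrypted=true
    export CIB_user=hacluster CIB_passwd=secret
    cibadmin --query

But as noted, anything that needs the other daemons still won't work.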

> > 
> > I think that what you are after is one of:
> > 
> > 1. have docker runtime for the particular container have the
> > abstract
> >Unix sockets shared from the host (--network=host? don't
> > remember
> >exactly)
> > 
> >- apparently, this weak style of compartmentalization comes with
> >  many drawbacks, so you may be facing hefty work of cutting any
> >  other interferences stemming from pre-chrooting assumptions of
> >  what is a singleton on the system, incl. sockets etc.
> > 
> > 2. use modern enough libqb (v1.0.2+) and use
> > 
> >  touch /etc/libqb/force-filesystem-sockets
> > 
> >on both host and within the container (assuming those two
> > locations
> >are fully disjoint, i.e., not an overlay-based reuse), you
> > should
> >then be able to share the respective reified sockets simply by
> >sharing the pertaining directory (normally /var/run it seems)
> > 
> >- if indeed a directory as generic as /var/run is involved,
> >  it may also lead to unexpected interferences, so the more
> >  minimalistic the container is, the better I think
> >  (or you can recompile libqb and play with path mapping
> >  in container configuration to achieve smoother plug-in)
> 
> Oh, and there's an additional prerequisite for both to at least
> theoretically work -- 1:1 sharing of /dev/shm (which may also
> be problematic in a sense).
> 
> > Then, pacemaker utilities would hopefully work across the container
> > boundaries just as if they were fully native, hence hawk shall as
> > well.
> > 
> > Let us know how far you'll get and where we can collectively join
> > you in your attempts; I don't think we have had such experience
> > disseminated here.  I know for sure I haven't ever tried this in
> > practice, but someone else here could have.  Also, there may be a
> > lot of fun with
> > various
> > Linux Security Modules like SELinux.
> 
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] corosync.service (and sbd.service) are not stopper on pacemaker shutdown when corosync-qdevice is used

2019-08-09 Thread Ken Gaillot
On Fri, 2019-08-09 at 08:19 +, Roger Zhou wrote:
> 
> On 8/9/19 3:39 PM, Jan Friesse wrote:
> > Roger Zhou napsal(a):
> > > 
> > > On 8/9/19 2:27 PM, Roger Zhou wrote:
> > > > 
> > > > On 7/29/19 12:24 AM, Andrei Borzenkov wrote:
> > > > > corosync.service sets StopWhenUnneded=yes which normally
> > > > > stops it when
> > > > > pacemaker is shut down.
> > > 
> > > One more thought,
> > > 
> > > Make sense to add "RefuseManualStop=true" to pacemaker.service?
> > > The same for corosync-qdevice.service?
> > > 
> > > And "RefuseManualStart=true" to corosync.service?
> > 
> > I would say short answer is no, but I would like to hear what is
> > the 
> > main idea for this proposal.
> 
> It's more about the out-of-the-box user experience: guiding users in
> the most common field use cases to manage the whole cluster stack with
> the appropriate steps, namely:
> 
> - To start stack: systemctl start pacemaker corosync-qdevice
> - To stop stack: systemctl stop corosync.service
> 
> and fewer error-prone assumptions:
> 
> With "RefuseManualStop=true" on pacemaker.service, sometimes (if not
> often),
> 
> - it prevents the wrong assumption/wish/impression that stopping
>   pacemaker also stops the whole cluster together with corosync
> 
> - it prevents users from forgetting the extra step of actually
>   stopping corosync
> 
> - it prevents some ISVs from creating disruptive scripts that only
>   stop pacemaker and forget the rest
> 
> - being rejected in the first place naturally guides users to run
>   `systemctl stop corosync.service`
> 
> 
> And the same idea extends a little further to
> 
> - "RefuseManualStop=true" to corosync-qdevice.service
> - and "RefuseManualStart=true" to corosync.service

This definitely can be a pain point for users, but I think the higher-
level tools (crm, pcs, hawk, etc.) are a better place to do this. At
the individual project level, it's possible to run corosync alone
(rare, but I have seen messages on this list by users who do) and that
can be useful for testing as well.

The higher-level tools exist to hide the complexity from the end user,
and they can coordinate multiple pieces like booth, qdevice-only nodes,
etc. As time goes on, it seems like there are more and more such pieces
-- single-host native facilities like systemd probably won't ever be
able to grasp the entire puzzle.

As an example, newer pcs versions start corosync on all nodes first,
then pacemaker on all nodes, so that if there's a quorum, it's already
available when pacemaker starts. There's no way to do such multi-host
dependencies in systemd.
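
With a recent pcs, for example, that's a single command run from any
one node, and pcs takes care of the cross-node ordering:

    pcs cluster start --all
    pcs cluster stop --all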

The documentation could be improved too, for users who do want a lower-
level view.
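
That said, anyone who wants Roger's guard on a particular machine today
can try it locally with a systemd drop-in instead of patching the
shipped unit (untested sketch; the drop-in file name is arbitrary):

    mkdir -p /etc/systemd/system/pacemaker.service.d
    cat > /etc/systemd/system/pacemaker.service.d/refuse-stop.conf <<'EOF'
    [Unit]
    RefuseManualStop=true
    EOF
    systemctl daemon-reload

That leaves the packaged unit file untouched and is trivial to revert.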

> 
> Well, I do feel corosync* is less error-prone than pacemaker in this
> regard.
> 
> Thanks,
> Roger
> 
> 
> > 
> > Regards,
> >Honza
> > 
> > > 
> > > @Jan, @Ken
> > > 
> > > What do you think?
> > > 
> > > Cheers,
> > > Roger
> > > 
> > > 
> > > > 
> > > > `systemctl stop corosync.service` is the right command to stop
> > > > the whole cluster stack.
> > > > 
> > > > It stops pacemaker and corosync-qdevice first, and stop SBD
> > > > too.
> > > > 
> > > > pacemaker.service: After=corosync.service
> > > > corosync-qdevice.service: After=corosync.service
> > > > sbd.service: PartOf=corosync.service
> > > > 
> > > > On the reverse side, to start the cluster stack, use
> > > > 
> > > > systemctl start pacemaker.service corosync-qdevice
> > > > 
> > > > It is slightly confusing from the impression. So, openSUSE uses
> > > > the
> > > > consistent commands as below:
> > > > 
> > > > crm cluster start
> > > > crm cluster stop
> > > > 
> > > > Cheers,
> > > > Roger
> > > > 
> > > > > Unfortunately, corosync-qdevice.service declares
> > > > > Requires=corosync.service and corosync-qdevice.service itself
> > > > > is *not*
> > > > > stopped when pacemaker.service is stopped. Which means
> > > > > corosync.service
> > > > > remains "needed" and is never stopped.
> > > > > 
> > > > > Also sbd.service (which is PartOf=corosync.service) remains
> > > > > running 
> > > > > as well.
> > > > > 
> > > > > The latter is really bad, as it means sbd watchdog can kick
> > > > > in at any
> > > > > time when user believes cluster stack is safely stopped. In
> > > > > particular
> > > > > if qnetd is not accessible (think network reconfiguration).

-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] pacemaker alerts list

2019-07-17 Thread Ken Gaillot
On Tue, 2019-07-16 at 13:53 +, Gershman, Vladimir wrote:
> Hi,
>  
> Is there a list of all possible alerts/events that Pacemaker can
> send out? Preferably with criticality levels for the alerts (minor,
> major, critical).

I'm not sure whether you're using "alerts" in a general sense here, or
specifically about Pacemaker's alert configuration:

https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Explained/index.html#idm140043888101536

If the latter, the "Writing an Alert Agent" section of that link lists
all the possible alert types. The criticality would be derived from
CRM_alert_desc, CRM_alert_rc, and CRM_alert_status.
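
For reference, an alert agent is just an executable that reads those
environment variables; here's an untested bare-bones sketch (the logger
tag and the messages are arbitrary, and the severity mapping is
entirely up to you, since Pacemaker itself doesn't assign
minor/major/critical levels):

    #!/bin/sh
    # Minimal alert agent sketch: turn the CRM_alert_* variables into a
    # one-line message and hand it to syslog.
    case "$CRM_alert_kind" in
        node)     msg="Node $CRM_alert_node is now $CRM_alert_desc" ;;
        fencing)  msg="Fencing of $CRM_alert_node: $CRM_alert_desc (rc=$CRM_alert_rc)" ;;
        resource) msg="$CRM_alert_task of $CRM_alert_rsc on $CRM_alert_node: $CRM_alert_desc (rc=$CRM_alert_rc status=$CRM_alert_status)" ;;
        *)        msg="Unhandled alert kind: $CRM_alert_kind" ;;
    esac
    # Replace logger with SNMP traps, mail, etc., and classify severity
    # however your monitoring system expects (e.g. rc != 0 => major).
    logger -t pacemaker-alert "$msg"
    exit 0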


>  
>  
>  
> Thank you,
> 
> Vlad
> Equipment Management (EM) System Engineer
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Which shell definitions to include?

2019-07-23 Thread Ken Gaillot
On Tue, 2019-07-23 at 08:48 +0200, Ulrich Windl wrote:
> Hi!
> 
> My old RAs include /usr/lib/ocf/resource.d/heartbeat/.ocf-shellfuncs. 
> Recently I discovered /usr/lib/ocf/lib/heartbeat/ocf-shellfuncs.
> (I wonder whether the latter shouldn't be /usr/lib/ocf/ocf-shellfuncs 
> simply)
> 
> So what is the recommended path to use for include? If RAs continue
> to use the old includes, we'll never get rid of the historic paths.

I believe the "lib/heartbeat" location is the newer one, and the dot
file is there for backward compatibility.
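
The include stanza the shipped agents use nowadays looks roughly like
this (it also lets packagers relocate the directory without editing
every RA):

    # near the top of the RA
    : ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/lib/heartbeat}
    . ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs

so for new or updated RAs I'd source it that way rather than via the
hidden .ocf-shellfuncs path.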

> BTW: It's all a bit of a mess, because
> /usr/lib/ocf/resource.d/heartbeat/.ocf-shellfuncs includes parts from
> /usr/lib/ocf/lib/heartbeat (${OCF_ROOT}/lib/heartbeat), actually
> being a symbolic link to /usr/lib/ocf/lib/heartbeat/ocf-shellfuncs.
> 
> In .ocf-directories there are definitions that systemd makes somewhat
> obsolete, like:
> HA_VARRUN (/var/run)
> HA_VARLOCK (/var/lock/subsys)
> 
> For one directory there is an updated definition:
> HA_RSCTMP (/run/resource-agents)
> HA_RSCTMP_OLD (/var/run/heartbeat/rsctmp)
> 
> It's not that I like all the changes systemd requires, but systemd
> complains about not being able to unmount /var while /var/run or
> /var/lock is being used...

Agreed, it should be a build-time option whether to use /run or
/var/run.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: Re: Antw: Re: Resource won't start, crm_resource -Y does not help

2019-07-23 Thread Ken Gaillot
On Tue, 2019-07-23 at 07:57 +0200, Ulrich Windl wrote:
> > > > Ken Gaillot  wrote on 22.07.2019 at
> > > > 18:14 in message
> 
> :
> > On Mon, 2019-07-22 at 15:45 +0200, Ulrich Windl wrote:
> > > Hi!
> > > 
> > > My RA actually sends OCF_ERR_ARGS if checking the args detects a
> > > problem.
> > > But as the error can be resolved sometimes without changing the
> > > args
> > > (eg.
> > > providing some resource by other means), I suspect CRM does not
> > > handle that
> > > properly. Even after a resource cleanup.
> > > 
> > > My RA logs any parameter check, and I can see that no parameter
> > > check
> > > is being
> > > performed...
> > > 
> > > I also noticed that the "invalid parameter" persists on a node
> > > even
> > > after
> > > restarting pacemaker on that node.
> > 
> > Pacemaker treats OCF_ERR_ARGS as a "hard" failure, meaning it won't
> > be
> > retried on the same node. But it should attempt to start on any
> > other
> > eligible nodes.
> 
> This makes _some_ sense: if the parameters are unacceptable
> (OCF_ERR_ARGS) it really makes no sense to retry (like having
> specified a host name that does not exist).
> However there are _two_ events that may change the state:
> 
> 1) If the parameters (e.g. hostname) are changed
> 
> 2) If the configuration outside the cluster was changed (e.g. making
> the hostname valid now)
> 
> In the light of 2) I don't really see why a resource cleanup does not
> reset the error condition. That is really unexpected.
> 
> > 
> > The failure should be cleared by either cleanup or pacemaker
> > restart.
> 
> According to my impression a cleanup did not change the condition but
> a cluster node restart did.

If a cleanup doesn't take care of it, something's going wrong.
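
For the record, the cleanup I'd expect to clear it is along the lines
of

    crm_resource --cleanup -r prm_idredir_test

(optionally with -N <node> to limit it to one node), or the equivalent
command from your higher-level tool.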

> 
> > That's the mystery here. I can't even imagine how it would be
> > possible
> > to survive a pacemaker restart -- are you sure it wasn't simply a
> > new
> > attempt getting the same result?
> 
> According to the logs of my RA there were fewer parameter checks than
> expected, and the only explanation to me was that the result was
> cached somewhere.
> 
> 
> > 
> > > 
> > > So:
> > > # crm_resource -r prm_idredir_test -VV start
> > >  warning: unpack_rsc_op_failure:Processing failed start
> > > of
> > > prm_idredir_test on h02: invalid parameter | rc=2
> > > 
> > > (Start was not even tried)
> > > 
> > > Eventually I was able to start the resource. Some other process
> > > had a
> > > socket
> > > address in use my resource needed...
> > 
> > Since you control the RA, you might want to set exit reasons, which
> > will be shown in the status display (the exitreason='' in your
> > output
> > below). There's an ocf_exit_reason convenience function, e.g.
> > 
> >ocf_exit_reason "Some other process has the socket address in
> > use"
> >exit $OCF_ERR_ARGS
> 
> Oh, this must be rather new ;-)
> 
> Since when is that available?
> 
> Regards,
> Ulrich

If you consider 2014 new :)

Of course it always takes a little longer to find its way into
distributions.
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] [ClusterLabs Developers] pacemaker geo redundancy - 2 nodes

2019-07-23 Thread Ken Gaillot
Hi Rohit,

Did you see the two earlier replies?

https://lists.clusterlabs.org/pipermail/users/2019-July/026035.html

On Tue, 2019-07-23 at 11:14 +0530, Rohit Saini wrote:
> Gentle Reminder!!
> 
> On Wed, Jul 17, 2019 at 1:14 PM Rohit Saini <
> rohitsaini111.fo...@gmail.com> wrote:
> > Gentle Reminder!!
> > 
> > On Mon, Jul 15, 2019 at 12:10 PM Rohit Saini <
> > rohitsaini111.fo...@gmail.com> wrote:
> > > Hi All,
> > > 
> > > I know pacemaker booth is being used for geographical redundancy.
> > > Currently I am using pacemaker/corosync for my local two-node
> > > redundancy.
> > > As I understand, booth needs at least 3 nodes to work correctly to
> > > do the automatic failovers. So it does not fit my requirements.
> > > 
> > > Few queries are
> > > 1) Can I make use of my current pacemaker/corosync to make it
> > > work across my two geographical nodes for automatic failovers,
> > > considering I am ready to ignore split-brain scenarios. I may
> > > need to tweak some timers I believe. Is this approach possible?
> > > 2) Any disadvantages of going this way?
> > > 
> > > 
> > > Thanks,
> > > Rohit
> 
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/developers
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] On the semantics of ocf_exit_reason()

2019-07-23 Thread Ken Gaillot
On Tue, 2019-07-23 at 08:17 +0200, Ulrich Windl wrote:
> Hi!
> 
> As suggested, I'm considering replacing all "ocf_log err ..." calls
> preceding an error exit code with "ocf_exit_reason ..." in my OCF
> RA.
> However I have a question: is it OK to call ocf_exit_reason more than
> once before actually exiting? I assume the last message will then be
> the one displayed as the reason.

Correct. (I had to check -- ocf_exit_reason will print the message to
stderr and log it, and pacemaker will parse stderr for the last
occurrence.)
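
So your checks below work if you simply swap ocf_log err for
ocf_exit_reason; just be aware the last failing check's message is the
one that wins. If you'd rather report the first problem found, collect
the message and make a single call before returning. Untested sketch
reusing your variable names, meant for inside your validate function:

    msg="" result=$OCF_SUCCESS
    fail() {
        # remember only the first problem found
        if [ -z "$msg" ]; then
            msg="$1"
            result=$2
        fi
    }
    [ -x "$isredir_bin" ] || fail "$me: missing $isredir_bin" $OCF_ERR_INSTALLED
    [ -n "$dest_tsap" ]   || fail "$me: missing value for \"dest\"" $OCF_ERR_ARGS
    [ -n "$msg" ] && ocf_exit_reason "$msg"
    return $result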

> My RA code checks multiple parameters, logging each error (not
> stopping at the first error if possible), like this:
> 
> ...
> if [ ! -x $isredir_bin ]; then
> ocf_log err "$me: missing $isredir_bin"
> result=$OCF_ERR_INSTALLED
> fi
> if [ "X${tag//[^-A-Za-z0-9._]/}" != "X${tag}" ]; then
> ocf_log err "$me: invalid value $tag for \"tag\""
> result=$OCF_ERR_ARGS
> fi
> if [ "X${backlog//[^0-9]/}" != "X${backlog}" ]; then
> ocf_log err "$me: invalid value $backlog for \"backlog\""
> result=$OCF_ERR_ARGS
> fi
> if [ -z "$dest_tsap" ]; then
> ocf_log err "$me: missing value for \"dest\""
> result=$OCF_ERR_ARGS
> elif [ "X${dest_tsap//[^-A-Za-z0-9._\\/]/}" != "X${dest_tsap}" ];
> then
> ocf_log err "$me: invalid value $dest_tsap for \"dest\""
> result=$OCF_ERR_ARGS
> fi
> ...
> 
> Regards,
> Ulrich
> 
> 
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] 2 nodes split brain with token timeout

2019-07-23 Thread Ken Gaillot
ig.totem.token_retransmits_before_loss_const (u32) = 4
> runtime.config.totem.window_size (u32) = 50
> totem.cluster_name (str) = FRPLZABPXY
> totem.interface.0.bindnetaddr (str) = 10.XX.YY.2
> totem.interface.0.broadcast (str) = yes
> totem.interface.0.mcastport (u16) = 5405
> totem.secauth (str) = off
> totem.totem (str) = 4000
> totem.transport (str) = udpu
> totem.version (u32) = 2
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] pacemaker geo redundancy - 2 nodes

2019-07-15 Thread Ken Gaillot
On Mon, 2019-07-15 at 12:10 +0530, Rohit Saini wrote:
> Hi All,
> 
> I know pacemaker booth is being used for geographical redundancy.
> Currently I am using pacemaker/corosync for my local two-node
> redundancy.
> As I understand, booth needs at least 3 nodes to work correctly to do
> the automatic failovers. So it does not fit my requirements.

Booth needs a third site, but it doesn't need to be a cluster node. It
can be a lightweight host running just the booth arbitrator.

However you do need full clusters at each site, so at least two nodes
at each site, plus the arbitrator host.
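
For what it's worth, the booth side is pretty small. The same
/etc/booth/booth.conf goes on both sites and the arbitrator, roughly
like this (the addresses and ticket name below are placeholders):

    cat > /etc/booth/booth.conf <<'EOF'
    transport = UDP
    port = 9929
    arbitrator = 192.0.2.10
    site = 192.0.2.1
    site = 192.0.2.2
    ticket = "ticketA"
    EOF

Each site's cluster then runs booth as a cluster resource and uses
ticket constraints to decide which resources it may run, based on which
site currently holds the ticket.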

> Few queries are
> 1) Can I make use of my current pacemaker/corosync to make it work
> across my two geographical nodes for automatic failovers, considering
> I am ready to ignore split-brain scenarios. I may need to tweak some
> timers I believe. Is this approach possible?

Yes, this is sometimes referred to as a "stretched" or "metro" cluster.
You can raise the corosync token timeout as needed to cover typical
latencies. However this is generally only recommended when the
connection between the two sites is highly reliable and low latency.

A lightweight host at a third site running qdevice can be used to
provide true quorum.
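
Concretely, the token timeout is just a setting in the totem section of
corosync.conf on both nodes, and with pcs the qdevice piece is roughly
a single command. The value and the qnetd host name below are
placeholders (untested sketch):

    # add to the totem { } section of /etc/corosync/corosync.conf on
    # both nodes, then restart corosync:
    #     token: 10000
    # and confirm what's active:
    corosync-cmapctl -g runtime.config.totem.token

    # point both nodes at a qnetd host running at the third site:
    pcs quorum device add model net host=qnetd-host algorithm=ffsplit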

> 2) Any disadvantages of going this way?

Raising the token timeout will delay the response to actual
node/network failures by the same amount.

If you're thinking of doing it without fencing, the consequences of
split brain depend on your workload. Something like a database or
cluster filesystem could become horribly corrupted.

> 
> 
> Thanks,
> Rohit
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Feedback wanted: Node reaction to fabric fencing

2019-07-24 Thread Ken Gaillot
Hi all,

A recent bugfix (clbz#5386) brings up a question.

A node may receive notification of its own fencing when fencing is
misconfigured (for example, an APC switch with the wrong plug number)
or when fabric fencing is used that doesn't cut the cluster network
(for example, fence_scsi).

Previously, the *intended* behavior was for the node to attempt to
reboot itself in that situation, falling back to stopping pacemaker if
that failed. However, due to the bug, the reboot always failed, so the
behavior effectively was to stop pacemaker.

Now that the bug is fixed, the node will indeed reboot in that
situation.

It occurred to me that some users configure fabric fencing specifically
so that nodes aren't ever intentionally rebooted. Therefore, I intend
to make this behavior configurable.

My question is, what do you think the default should be?

1. Default to the correct behavior (reboot)

2. Default to the current behavior (stop)

3. Default to the current behavior for now, and change it to the
correct behavior whenever pacemaker 2.1 is released (probably a few
years from now)
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Adding HAProxy as a Resource

2019-07-25 Thread Ken Gaillot
On Thu, 2019-07-25 at 07:23 +, Somanath Jeeva wrote:
> Hi 
> 
> A systemd unit file is available for haproxy, but the pcs resource
> standards command does not list the systemd standard.
> 
> Also I am not using the pacemaker packages from redhat. I am using
> the packages downloaded from clusterlabs.

Hi Somanath,

Which version of pacemaker are you using?

If you built it from source, did you give any options to the configure
command?

> 
> 
> 
> With Regards
> Somanath Thilak J
> 
> -Original Message-
> From: Tomas Jelinek  
> Sent: Monday, July 15, 2019 5:58 PM
> To: users@clusterlabs.org
> Subject: Re: [ClusterLabs] Adding HAProxy as a Resource
> 
> Hi,
> 
> Do you have a systemd unit file for haproxy installed?
> Does 'crm_resource --list-standards' print 'systemd'?
> Does 'crm_resource --list-agents systemd' print 'haproxy'?
> Note that when you use full agent name (that is including : ) it is
> case sensitive in pcs.
> 
> Regards,
> Tomas
> 
> 
> On 11. 07. 19 at 10:14, Somanath Jeeva wrote:
> > Hi
> > 
> > I am using the resource agents built from clusterlabs and when I
> > add the systemd resource I am getting the below error.
> > 
> > $ sudo pcs resource create HAPROXY systemd:haproxy op monitor 
> > interval=2s
> > Error: Agent 'systemd:haproxy' is not installed or does not
> > provide 
> > valid metadata: Metadata query for systemd:haproxy failed: -22,
> > use 
> > --force to override
> > 
> > 
> > 
> > With Regards
> > Somanath Thilak J
> > 
> > -Original Message-
> > From: Kristoffer Grönlund 
> > Sent: Thursday, July 11, 2019 1:22 PM
> > To: Cluster Labs - All topics related to open-source clustering 
> > welcomed 
> > Cc: Somanath Jeeva 
> > Subject: Re: [ClusterLabs] Adding HAProxy as a Resource
> > 
> > On 2019-07-11 09:31, Somanath Jeeva wrote:
> > > Hi All,
> > > 
> > > I am using HAProxy in my environment, which I plan to add to
> > > pacemaker as a resource. I see no RA available for it in the
> > > resource-agents package.
> > > 
> > > Should I write a new RA or is there any way to add it to
> > > pacemaker as 
> > > a systemd service?
> > 
> > Hello,
> > 
> > haproxy works well as a plain systemd service, so you can add it as
> > systemd:haproxy - that is, instead of an ocf: prefix, just put
> > systemd:.
> > 
> > If you want the cluster to manage multiple, differently configured
> > instances of haproxy, you might have to either create custom
> > systemd service scripts for each one, or create an agent with
> > parameters.
> > 
> > Cheers,
> > Kristoffer
> > 
> > > 
> > > 
> > > 
> > > With Regards
> > > Somanath Thilak J
> > > 
> > > 
> > > ___
> > > Manage your subscription:
> > > 
https://protect2.fireeye.com/url?k=28466b53-74926310-28462bc8-86a1150
> > > b 
> > > c3ba-
> > > bb674f3a9b557cbd=1=https%3A%2F%2Flists.clusterlabs.org%2Fmai
> > > l
> > > man%2Flistinfo%2Fusers
> > > 
> > > ClusterLabs home:
> > > 
https://protect2.fireeye.com/url?k=4c5edd73-108ad530-4c5e9de8-86a1150
> > > b c3ba-5da4e39ebe912cdf=1=https%3A%2F%2F
> > > www.clusterlabs.org%2F
> > 
> > ___
> > Manage your subscription:
> > https://lists.clusterlabs.org/mailman/listinfo/users
> > 
> > ClusterLabs home: https://www.clusterlabs.org/
> > 
> 
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
