[ClusterLabs] Antw: Is corosync supposed to be restarted if it dies?

2017-11-27 Thread Ulrich Windl



> In one of the guides, the suggested procedure to simulate split brain was
> to kill the corosync process. It actually worked on one cluster, but on
> another the corosync process was restarted after being killed without the cluster

Maybe try a "kill -STOP" instead... ;-)
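That way corosync stops responding but never exits, so there is nothing for
systemd (or anything else) to restart. A minimal sketch, assuming corosync
runs as a single process:

kill -STOP $(pidof corosync)   # freeze corosync; the node should be declared lost
kill -CONT $(pidof corosync)   # resume it after the test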

> noticing anything. Except that after several attempts pacemaker died,
> stopping resources ... :)
> 
> This is SLES12 SP2; I do not see any Restart in the service definition, so
> it is probably not systemd.
> 


___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Antw: SBD stonith in 2 node cluster - how to make it prefer one side of cluster?

2017-11-27 Thread Ulrich Windl
Hi!

With sbd running on each node, I think it doesn't make a big difference
which one was started first in case of a split brain: there is a chance that
both nodes will kill each other.

I'd put my efforts into redundant, reliable networking instead (MHO)...
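For illustration, a rough corosync 2.x sketch of a second ring; the
bindnetaddr/mcast values are placeholders for your two networks:

totem {
    version: 2
    rrp_mode: passive
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.1.0
        mcastaddr: 239.255.1.1
        mcastport: 5405
    }
    interface {
        ringnumber: 1
        bindnetaddr: 10.0.0.0
        mcastaddr: 239.255.2.1
        mcastport: 5407
    }
}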

Regards,
Ulrich



> Wrapping my head around how pcmk_delay_max works, my understanding is:
> 
> - on startup pacemaker always starts one instance of stonith/sbd; it
> probably selects a node for it at random. I suppose this initial start is
> delayed by a random interval within pcmk_delay_max.
> 
> - when the cluster is partitioned, pacemaker *also* starts one instance of
> stonith/sbd in each partition where it is not yet running. This startup
> is also delayed by a random interval within pcmk_delay_max.
> 
> - this makes the partition that already has stonith/sbd running win the
> race for the kill request.
> 
> Is my understanding correct?
> 
> If yes, consider a two-node cluster where one application is more
> important than the other. The obvious example is a replicated database -
> in case of split brain we want to preserve the node with the primary, as it
> likely has active connections.
> 
> Would an advisory colocation constraint between the application and
> stonith/sbd work? Let's consider (using crmsh notation):
> 
> primitive my_database
> ms my_replicated_database my_database
> primitive fencing_sbd stonith:external/sbd params pcmk_delay_max=15
> colocation prefer_primary 10: fencing_sbd my_replicated_database:Master
> 
> Is it going to work?
> 
> It should work on startup, as it simply affects where the sbd resource is
> placed initially, and pacemaker needs to make this decision anyway.
> 
> I expect it to work if the my_replicated_database master moves to another
> node - pacemaker should move the sbd resource too, right? It does add a small
> window where no stonith agent is running, but as I understand it, pacemaker
> is going to start one anyway in case of split brain, so in the worst case the
> non-preferred node will be fenced, which is no worse than what we have
> already.
> 
> What I am not sure about is what happens during split brain. Will the
> colocation affect pacemaker's decision to start another copy of the sbd
> resource in the other partition? I hope not: as it is advisory, it should
> still use the only available node left in this case?
> 
> Does it all make sense? Has anyone used it in real life?
> 
> 




Re: [ClusterLabs] Is corosync supposed to be restarted if it dies?

2017-11-27 Thread Ferenc Wágner
Andrei Borzenkov writes:

> On 25.11.2017 10:05, Andrei Borzenkov wrote:
>
>> In one of the guides, the suggested procedure to simulate split brain was
>> to kill the corosync process. It actually worked on one cluster, but on
>> another the corosync process was restarted after being killed without the
>> cluster noticing anything. Except that after several attempts pacemaker
>> died, stopping resources ... :)
>> 
>> This is SLES12 SP2; I do not see any Restart in the service definition,
>> so it is probably not systemd.
>> 
> FTR - it was not corosync but pacemaker; its unit file specifies
> Restart=on-failure, so killing corosync caused pacemaker to fail and be
> restarted by systemd.

And starting corosync via a Requires dependency?
-- 
Feri



[ClusterLabs] Pacemaker resource is not tried to be recovered after failure on slave node even when failcount is less than migration-threshold

2017-11-27 Thread Pankaj
Hi,

Could you please help me with the query below.

I have a stateful resource, stateful_ms, defined as below. The
migration-threshold is set to 4 and resource-stickiness to 100.
I have a pacemaker cluster of 5 nodes. Initially the resource (stateful_ms)
is up as master on node-0 and as slave on the other nodes.
I made the monitor of stateful_ms fail on node-0. As expected, after the
failcount reached 4 on node-0, the resource instance on node-1 was promoted
to master.
But when the monitor of stateful_ms on node-1 is made to fail, the resource
on node-0 is promoted immediately, even though the failcount on node-1 was
only 1 against the configured migration-threshold=4.

Could you please help me to understand:
1. Why is the resource instance on node-1 not promoted again while its
failcount is still less than migration-threshold?
2. How can we make sure that the resource is first recovered on the
current node (where it failed), as per migration-threshold, and only then
is an instance on another node promoted?

Below are the setup details:
Resource configuration:
crm configure primitive stateful_dummy ocf:pacemaker:Stateful op monitor
interval="5" role="Master" timeout="20" op monitor interval="10"
role="Slave" timeout="20" meta resource-stickiness="100"
crm configure ms stateful_ms stateful_dummy meta resource-stickiness="100"
notify="true" master-max="1" interleave="true" migration-threshold=4
failure-timeout=60

# crm status
Stack: corosync
Current DC: NODE-0 (version 1.1.16-94ff4df51a) - partition with quorum
Last updated: Tue Nov 14 12:54:52 2017
Last change: Tue Nov 14 12:54:43 2017 by root via cibadmin on NODE-1

5 nodes configured

 Master/Slave Set: stateful_ms [stateful_dummy]
 Masters: [ NODE-0 ]
 Slaves: [ NODE-1 NODE-2 NODE-3 NODE-4 ]


# crm configure show
node 1: NODE-0
node 2: NODE-1
node 3: NODE-2
node 4: NODE-3
node 5: NODE-4

primitive stateful_dummy ocf:pacemaker:Stateful \
op monitor interval=5 role=Master timeout=20 \
op monitor interval=10 role=Slave timeout=20 \
meta resource-stickiness=100
ms stateful_ms stateful_dummy \
meta resource-stickiness=100 notify=true master-max=1
interleave=true migration-threshold=4 failure-timeout=60 target-role=Started


#crm resource score
Allocation scores and utilization information:
Original: NODE-0 capacity:
Original: NODE-1 capacity:
Original: NODE-2 capacity:
Original: NODE-3 capacity:
Original: NODE-4 capacity:
clone_color: stateful_ms allocation score on NODE-0: 0
clone_color: stateful_ms allocation score on NODE-1: 0
clone_color: stateful_ms allocation score on NODE-2: 0
clone_color: stateful_ms allocation score on NODE-3: 0
clone_color: stateful_ms allocation score on NODE-4: 0
clone_color: stateful_dummy:0 allocation score on NODE-0: 110
clone_color: stateful_dummy:0 allocation score on NODE-1: 0
clone_color: stateful_dummy:0 allocation score on NODE-2: 0
clone_color: stateful_dummy:0 allocation score on NODE-3: 0
clone_color: stateful_dummy:0 allocation score on NODE-4: 0
clone_color: stateful_dummy:1 allocation score on NODE-0: 0
clone_color: stateful_dummy:1 allocation score on NODE-1: 105
clone_color: stateful_dummy:1 allocation score on NODE-2: 0
clone_color: stateful_dummy:1 allocation score on NODE-3: 0
clone_color: stateful_dummy:1 allocation score on NODE-4: 0
clone_color: stateful_dummy:2 allocation score on NODE-0: 0
clone_color: stateful_dummy:2 allocation score on NODE-1: 0
clone_color: stateful_dummy:2 allocation score on NODE-2: 105
clone_color: stateful_dummy:2 allocation score on NODE-3: 0
clone_color: stateful_dummy:2 allocation score on NODE-4: 0
clone_color: stateful_dummy:3 allocation score on NODE-0: 0
clone_color: stateful_dummy:3 allocation score on NODE-1: 0
clone_color: stateful_dummy:3 allocation score on NODE-2: 0
clone_color: stateful_dummy:3 allocation score on NODE-3: 105
clone_color: stateful_dummy:3 allocation score on NODE-4: 0
clone_color: stateful_dummy:4 allocation score on NODE-0: 0
clone_color: stateful_dummy:4 allocation score on NODE-1: 0
clone_color: stateful_dummy:4 allocation score on NODE-2: 0
clone_color: stateful_dummy:4 allocation score on NODE-3: 0
clone_color: stateful_dummy:4 allocation score on NODE-4: 105
native_color: stateful_dummy:2 allocation score on NODE-0: 0
native_color: stateful_dummy:2 allocation score on NODE-1: 0
native_color: stateful_dummy:2 allocation score on NODE-2: 105
native_color: stateful_dummy:2 allocation score on NODE-3: 0
native_color: stateful_dummy:2 allocation score on NODE-4: 0
native_assign_node: stateful_dummy:2 utilization on NODE-2:
native_color: stateful_dummy:4 allocation score on NODE-0: 0
native_color: stateful_dummy:4 allocation score on NODE-1: 0
native_color: stateful_dummy:4 allocation score on NODE-2: -INFINITY
native_color: stateful_dummy:4 allocation score on NODE-3: 0
native_color: stateful_dummy:4 allocation score on NODE-4: 105
native_assign_node: stateful_dummy:4 utilization on NODE-4:
native_color: stateful_dummy:1 allocation score on NODE-0: 0

Re: [ClusterLabs] Is corosync supposed to be restarted if it dies?

2017-11-27 Thread Andrei Borzenkov


Sent from my iPhone

> On 27 Nov 2017, at 14:36, Ferenc Wágner wrote:
> 
> Andrei Borzenkov writes:
> 
>> On 25.11.2017 10:05, Andrei Borzenkov wrote:
>> 
>>> In one of the guides, the suggested procedure to simulate split brain was
>>> to kill the corosync process. It actually worked on one cluster, but on
>>> another the corosync process was restarted after being killed without the
>>> cluster noticing anything. Except that after several attempts pacemaker
>>> died, stopping resources ... :)
>>> 
>>> This is SLES12 SP2; I do not see any Restart in the service definition,
>>> so it is probably not systemd.
>>> 
>> FTR - it was not corosync but pacemaker; its unit file specifies
>> Restart=on-failure, so killing corosync caused pacemaker to fail and be
>> restarted by systemd.
> 
> And starting corosync via a Requires dependency?

Exactly.
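For reference, the interplay looks roughly like this in the unit file (an
excerpt, not the complete SLES definition):

# pacemaker.service (excerpt)
[Unit]
Requires=corosync.service
After=corosync.service

[Service]
Restart=on-failure

Killing corosync makes pacemaker exit with an error, systemd restarts
pacemaker because of Restart=on-failure, and the Requires= dependency pulls
corosync back up along with it.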


> -- 
> Feri
> 



Re: [ClusterLabs] pcs create master/slave resource doesn't work

2017-11-27 Thread Ken Gaillot
On Fri, 2017-11-24 at 18:00 +0800, Hui Xiang wrote:
> Jan,
> 
>   Much appreciated for your help; I am getting further, but it still
> looks very strange.
> 
> 1. To use "debug-promote", I upgraded pacemaker from 1.1.12 to 1.1.16, and
> pcs to 0.9.160.
> 
> 2. Recreated the resources with the commands below:
> pcs resource create ovndb_servers ocf:ovn:ovndb-servers \
>   master_ip=192.168.0.99 \
>   op monitor interval="10s" \
>   op monitor interval="11s" role=Master
> pcs resource master ovndb_servers-master ovndb_servers \
>   meta notify="true" master-max="1" master-node-max="1" clone-max="3" \
>   clone-node-max="1"
> pcs resource create VirtualIP ocf:heartbeat:IPaddr2 ip=192.168.0.99 \
>   op monitor interval=10s
> pcs constraint colocation add VirtualIP with master ovndb_servers-master \
>   score=INFINITY
> 
> 3. pcs status
>  Master/Slave Set: ovndb_servers-master [ovndb_servers]
>      Stopped: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
>  VirtualIP (ocf::heartbeat:IPaddr2): Stopped
> 
> 4. Manually ran 'debug-start' on the 3 nodes and 'debug-promote' on one of
> the nodes.
> Run below on [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]:
> # pcs resource debug-start ovndb_servers --full
> run below on [ node-1.domain.tld ]
> # pcs resource debug-promote ovndb_servers --full

Before running debug-* commands, I'd unmanage the resource or put the
cluster in maintenance mode, so Pacemaker doesn't try to "correct" your
actions.
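Something like (either one, not both; the resource name is taken from your
configuration):

pcs resource unmanage ovndb_servers
pcs property set maintenance-mode=true

with "pcs resource manage ovndb_servers" or
"pcs property set maintenance-mode=false" to undo it afterwards.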

> 
> 5. pcs status
>  Master/Slave Set: ovndb_servers-master [ovndb_servers]
>      Stopped: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
>  VirtualIP (ocf::heartbeat:IPaddr2): Stopped
> 
> 6. However, I have seen that one of the ovndb_servers instances has indeed
> been promoted to master, but pcs status still shows everything 'Stopped'.
> What am I missing?

It's hard to tell from these logs. It's possible the resource agent's
monitor command is not exiting with the expected status values:

http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_requirements_for_multi_state_resource_agents
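The key requirement is that the monitor action reports the role via its exit
code. A skeleton of what that looks like in an agent (is_master is a
hypothetical helper; the OCF_* variables come from the resource-agents
ocf-shellfuncs include):

ovndb_servers_monitor() {
    if ! pidof ovsdb-server >/dev/null 2>&1; then
        return $OCF_NOT_RUNNING      # 7: not running at all
    fi
    if is_master; then
        return $OCF_RUNNING_MASTER   # 8: running as master
    fi
    return $OCF_SUCCESS              # 0: running as slave
}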

One of the nodes will be elected the DC, meaning it coordinates the
cluster's actions. The DC's logs will have more "pengine:" messages,
with each action that needs to be taken (e.g. "* Start  ").

You can look through those actions to see what the cluster decided to
do -- whether the resources were ever started, whether any was
promoted, and whether any were explicitly stopped.
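For example, on the DC (assuming the logs land in /var/log/messages; adjust
the path if you set PCMK_logfile):

grep -E 'pengine:.*(Start|Stop|Promote|Demote|Recover)' /var/log/messages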


>  >  stderr: + 17:45:59: ocf_log:327: __OCF_MSG='ovndb_servers:
> Promoting node-1.domain.tld as the master'
>  >  stderr: + 17:45:59: ocf_log:329: case "${__OCF_PRIO}" in
>  >  stderr: + 17:45:59: ocf_log:333: __OCF_PRIO=INFO
>  >  stderr: + 17:45:59: ocf_log:338: '[' INFO = DEBUG ']'
>  >  stderr: + 17:45:59: ocf_log:341: ha_log 'INFO: ovndb_servers:
> Promoting node-1.domain.tld as the master'
>  >  stderr: + 17:45:59: ha_log:253: __ha_log 'INFO: ovndb_servers:
> Promoting node-1.domain.tld as the master'
>  >  stderr: + 17:45:59: __ha_log:185: local ignore_stderr=false
>  >  stderr: + 17:45:59: __ha_log:186: local loglevel
>  >  stderr: + 17:45:59: __ha_log:188: '[' 'xINFO: ovndb_servers:
> Promoting node-1.domain.tld as the master' = x--ignore-stderr ']'
>  >  stderr: + 17:45:59: __ha_log:190: '[' none = '' ']'
>  >  stderr: + 17:45:59: __ha_log:192: tty
>  >  stderr: + 17:45:59: __ha_log:193: '[' x = x0 -a x = xdebug ']'
>  >  stderr: + 17:45:59: __ha_log:195: '[' false = true ']'
>  >  stderr: + 17:45:59: __ha_log:199: '[' '' ']'
>  >  stderr: + 17:45:59: __ha_log:202: echo 'INFO: ovndb_servers:
> Promoting node-1.domain.tld as the master'
>  >  stderr: INFO: ovndb_servers: Promoting node-1.domain.tld as the
> master 
>  >  stderr: + 17:45:59: __ha_log:204: return 0
>  >  stderr: + 17:45:59: ovsdb_server_promote:378:
> /usr/sbin/crm_attribute --type crm_config --name OVN_REPL_INFO -s
> ovn_ovsdb_master_server -v node-1.domain.tld
>  >  stderr: + 17:45:59: ovsdb_server_promote:379:
> ovsdb_server_master_update 8
>  >  stderr: + 17:45:59: ovsdb_server_master_update:214: case $1 in
>  >  stderr: + 17:45:59: ovsdb_server_master_update:218:
> /usr/sbin/crm_master -l reboot -v 10
>  >  stderr: + 17:45:59: ovsdb_server_promote:380: return 0
>  >  stderr: + 17:45:59: 458: rc=0
>  >  stderr: + 17:45:59: 459: exit 0
> 
> 
> On 23/11/17 23:52 +0800, Hui Xiang wrote:
> > I am working on HA with 3-nodes, which has below configurations:
> > 
> > """
> > pcs resource create ovndb_servers ocf:ovn:ovndb-servers \
> >   master_ip=168.254.101.2 \
> >   op monitor interval="10s" \
> >   op monitor interval="11s" role=Master
> > pcs resource master ovndb_servers-master ovndb_servers \
> >   meta notify="true" master-max="1" master-node-max="1" clone-max="3" \
> > clone-node-max="1"
> > pcs resource create VirtualIP ocf:heartbeat:IPaddr2 ip=168.254.101.2 \
> > op monitor interval=10s
> > pcs constraint order promote ovndb_servers-master then VirtualIP
> > pcs con

Re: [ClusterLabs] Low priority colocation

2017-11-27 Thread Ken Gaillot
On Fri, 2017-11-24 at 11:32, Alan Birtles wrote:
> Is it possible to set up pacemaker with a resource which will only run
> on the same machine as another resource, but which will not trigger a
> failover of the second resource if the first resource becomes
> unrunnable?
> e.g.
> I have 2 resources, A and B. Resource A is the main resource and B is
> some monitoring resource which should only run on the same machine
> as A, but if B fails to run on a machine it shouldn't trigger a
> movement of A to another machine. If A fails on a machine it should
> move to another machine, and then B should also move to that machine.
> I've tried setting a low score on the colocation constraint, which
> does stop A from moving when B fails, but this then allows B to start
> on a machine which isn't running A.

Pacemaker will place A on a node first, but it will take B's
preferences into account due to the colocation. If B becomes
unrunnable, its resulting aversion to the current node can cause A to
move.

There's no direct way to say what you want, but I can think of a few
ways to make it happen:

* If you set A's stickiness to INFINITY, that should override any
preference from B. I'd set A's priority meta-attribute higher than B's
as well.

* If you set on-fail=stop for B's monitor, the cluster will stop B
rather than try to restart and/or move it if it fails.

* The ocf:pacemaker:attribute agent, which sets an arbitrary node
attribute to a particular value depending on whether it is started or
stopped, was added for cases like this. You can create a group of A
plus the attribute, then create a location constraint with a rule
allowing B to run only where the attribute is set as started. This way,
A is unaware of the relationship and ignores B's preferences.
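Roughly, in crm shell syntax (the resource and attribute names are
illustrative):

primitive A_active ocf:pacemaker:attribute \
    params name=A_active active_value=1 inactive_value=0
group grp_A A A_active
location B_follows_A B \
    rule -inf: not_defined A_active or A_active ne 1

B is banned from every node where the attribute is unset or not 1, i.e.
wherever the group (and thus A) is not running, while A itself carries no
constraint pointing at B.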
-- 
Ken Gaillot 



Re: [ClusterLabs] pcs create master/slave resource doesn't work

2017-11-27 Thread Jan Pokorný
On 27/11/17 12:07 -0600, Ken Gaillot wrote:
> On Fri, 2017-11-24 at 18:00 +0800, Hui Xiang wrote:
>>   Much appreciated for your help; I am getting further, but it still
>> looks very strange.
>> 
>> 1. To use "debug-promote", I upgraded pacemaker from 1.1.12 to 1.1.16, and
>> pcs to 0.9.160.
>> 
>> 2. Recreated the resources with the commands below:
>> pcs resource create ovndb_servers ocf:ovn:ovndb-servers \
>>   master_ip=192.168.0.99 \
>>   op monitor interval="10s" \
>>   op monitor interval="11s" role=Master
>> pcs resource master ovndb_servers-master ovndb_servers \
>>   meta notify="true" master-max="1" master-node-max="1" clone-max="3" \
>>   clone-node-max="1"
>> pcs resource create VirtualIP ocf:heartbeat:IPaddr2 ip=192.168.0.99 \
>>   op monitor interval=10s
>> pcs constraint colocation add VirtualIP with master ovndb_servers-master \
>>   score=INFINITY
>> 
>> 3. pcs status
>>  Master/Slave Set: ovndb_servers-master [ovndb_servers]
>>      Stopped: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
>>  VirtualIP (ocf::heartbeat:IPaddr2): Stopped
>> 
>> 4. Manually ran 'debug-start' on the 3 nodes and 'debug-promote' on one of
>> the nodes.
>> Run below on [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]:
>> # pcs resource debug-start ovndb_servers --full
>> run below on [ node-1.domain.tld ]
>> # pcs resource debug-promote ovndb_servers --full
> 
> Before running debug-* commands, I'd unmanage the resource or put the
> cluster in maintenance mode, so Pacemaker doesn't try to "correct" your
> actions.

Agreed, that's a good rule of thumb for avoiding surprises, though in
this particular case it did not appear there was anything to correct.

Would it then make sense for pcs (the high-level tool) to check the
circumstances, especially a concurrent run of the resource, prior to
delegating the command down the line?

-- 
Jan (Poki)




Re: [ClusterLabs] Pacemaker responsible of DRBD and a systemd resource

2017-11-27 Thread Ken Gaillot
On Mon, 2017-11-13 at 10:24 -0500, Derek Wuelfrath wrote:
> Hello Ken !
> 
> > Make sure that the systemd service is not enabled. If pacemaker is
> > managing a service, systemd can't also be trying to start and stop
> > it.
> 
> It is not. I made sure of this in the first place :)
> 
> > Beyond that, the question is what log messages are there from
> > around
> > the time of the issue (on both nodes).
> 
> Well, that’s the thing. There are not many log messages telling what
> is actually happening. The ‘systemd’ resource is not even trying to
> start (nothing in either log for that resource). Here are the logs
> from my last attempt:
> Scenario:
> - Services were running on ‘pancakeFence2’. DRBD was synced and
> connected
> - I rebooted ‘pancakeFence2’. Services failed to ‘pancakeFence1’
> - After ‘pancakeFence2’ comes back, services are running just fine on
> ‘pancakeFence1’ but DRBD is in Standalone due to split-brain
> 
> Logs for pancakeFence1: https://pastebin.com/dVSGPP78
> Logs for pancakeFence2: https://pastebin.com/at8qPkHE

When you say you rebooted the node, was it a clean reboot or a
simulated failure like power-off or kernel-panic? If it was a simulated
failure, then the behavior makes sense in this case. If a node
disappears for no known reason, DRBD ends up in split-brain. If fencing
were configured, the surviving node would fence the other one to be
sure it's down, but it might still be unable to reconnect to DRBD
without manual intervention.
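Once fencing is in place, DRBD itself can be wired into it as well; a DRBD
8.4-style sketch, assuming the stock handlers shipped with drbd-utils:

# in the resource section of /etc/drbd.d/<resource>.res
disk {
    fencing resource-and-stonith;
}
handlers {
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
}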

The systemd issue is separate, and I can't think of what would cause
it. If you have PCMK_logfile set in /etc/sysconfig/pacemaker, you will
get more extensive log messages there. One node will be elected DC and
will have more "pengine:" messages than the other; those will show all
the decisions made about what actions to take, and the results of those
actions.
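For example (an excerpt; the exact file location can vary by distribution):

# /etc/sysconfig/pacemaker
PCMK_logfile=/var/log/pacemaker.log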

> It really looks like the status-check mechanism of
> corosync/pacemaker for a systemd resource forces the resource to
> “start” and therefore starts the ones above that resource in the
> group (DRBD in this instance).
> This does not happen for a regular OCF resource (IPaddr2, for example)
> 
> Cheers!
> -dw
> 
> --
> Derek Wuelfrath
> dwuelfr...@inverse.ca :: +1.514.447.4918 (x110) :: +1.866.353.6153
> (x110)
> Inverse inc. :: Leaders behind SOGo (www.sogo.nu), PacketFence
> (www.packetfence.org) and Fingerbank (www.fingerbank.org)
> 
> > On Nov 10, 2017, at 11:39, Ken Gaillot  wrote:
> > 
> > On Thu, 2017-11-09 at 20:27 -0500, Derek Wuelfrath wrote:
> > > Hello there,
> > > 
> > > First post here but following since a while!
> > 
> > Welcome!
> > 
> > > Here’s my issue,
> > > we have been building and running this type of cluster for a while
> > > and never really encountered this kind of problem.
> > > 
> > > I recently set up a Corosync / Pacemaker / PCS cluster to manage DRBD
> > > along with different other resources. Some of these resources are
> > > systemd resources… this is the part where things are
> > > “breaking”.
> > > 
> > > Having a two-server cluster running only DRBD, or DRBD with an OCF
> > > IPaddr2 resource (cluster IP in this instance), works just fine. I can
> > > easily move from one node to the other without any issue.
> > > As soon as I add a systemd resource to the resource group, things
> > > break. Moving from one node to the other using standby mode works
> > > just fine, but as soon as a Corosync / Pacemaker restart involves
> > > polling of a systemd resource, it seems like it is trying to start
> > > the whole resource group and therefore creates a split-brain of the
> > > DRBD resource.
> > 
> > My first two suggestions would be:
> > 
> > Make sure that the systemd service is not enabled. If pacemaker is
> > managing a service, systemd can't also be trying to start and stop
> > it.
> > 
> > Fencing is the only way pacemaker can resolve split-brains and
> > certain
> > other situations, so that will help in the recovery.
> > 
> > Beyond that, the question is what log messages are there from
> > around
> > the time of the issue (on both nodes).
> > 
> > 
> > > It is the best explanation / description of the situation that I
> > > can
> > > give. If it need any clarification, examples, … I am more than
> > > open
> > > to share them.
> > > 
> > > Any guidance would be appreciated :)
> > > 
> > > Here’s the output of a ‘pcs config’
> > > 
> > > https://pastebin.com/1TUvZ4X9
> > > 
> > > Cheers!
> > > -dw
> > > 
> > > --
> > > Derek Wuelfrath
> > > dwuelfr...@inverse.ca :: +1.514.447.4918 (x110) ::
> > > +1.866.353.6153
> > > (x110)
> > > Inverse inc. :: Leaders behind SOGo (www.sogo.nu), PacketFence
> > > (www.packetfence.org) and Fingerbank (www.fingerbank.org)
> > -- 
> > Ken Gaillot 
> > 

Re: [ClusterLabs] Pacemaker resource is not tried to be recovered after failure on slave node even when failcount is less than migration-threshold

2017-11-27 Thread Andrei Borzenkov


Sent from my iPhone

> On 27 Nov 2017, at 14:50, Pankaj wrote:
> 
> Hi,
> 
> Could you please help me with the query below.
> 
> I have a stateful resource, stateful_ms, defined as below. The
> migration-threshold is set to 4 and resource-stickiness to 100.
> I have a pacemaker cluster of 5 nodes. Initially the resource (stateful_ms)
> is up as master on node-0 and as slave on the other nodes.
> I made the monitor of stateful_ms fail on node-0. As expected, after the
> failcount reached 4 on node-0, the resource instance on node-1 was promoted
> to master.
> But when the monitor of stateful_ms on node-1 is made to fail, the resource
> on node-0 is promoted immediately, even though the failcount on node-1 was
> only 1 against the configured migration-threshold=4.
> 
> Could you please help me to understand:
> 1. Why is the resource instance on node-1 not promoted again while its
> failcount is still less than migration-threshold?
> 2. How can we make sure that the resource is first recovered on the
> current node (where it failed), as per migration-threshold, and only then
> is an instance on another node promoted?
> 
> Below are the setup details:
> Resource configuration:
> crm configure primitive stateful_dummy ocf:pacemaker:Stateful op monitor 
> interval="5" role="Master" timeout="20" op monitor interval="10" role="Slave" 
> timeout="20" meta resource-stickiness="100"
> crm configure ms stateful_ms stateful_dummy meta resource-stickiness="100" 
> notify="true" master-max="1" interleave="true" migration-threshold=4 
> failure-timeout=60
> 

Most likely the failure counter gets reset due to the low failure-timeout;
then after the master failure pacemaker first demotes it and ends up with two
slaves that have equal master scores. From there it can select either one.
Try a larger value for failure-timeout first.
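For example (crm shell; 600 is an arbitrary value well above both monitor
intervals):

crm configure ms stateful_ms stateful_dummy \
    meta resource-stickiness=100 notify=true master-max=1 \
    interleave=true migration-threshold=4 failure-timeout=600

"crm_mon -1f" (one-shot status including fail counts) should then show
whether the counters really survive between the induced failures.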


> # crm status
> Stack: corosync
> Current DC: NODE-0 (version 1.1.16-94ff4df51a) - partition with quorum
> Last updated: Tue Nov 14 12:54:52 2017
> Last change: Tue Nov 14 12:54:43 2017 by root via cibadmin on NODE-1
> 
> 5 nodes configured
> 
>  Master/Slave Set: stateful_ms [stateful_dummy]
>  Masters: [ NODE-0 ]
>  Slaves: [ NODE-1 NODE-2 NODE-3 NODE-4 ]
> 
> 
> # crm configure show
> node 1: NODE-0
> node 2: NODE-1
> node 3: NODE-2
> node 4: NODE-3
> node 5: NODE-4
> 
> primitive stateful_dummy ocf:pacemaker:Stateful \
> op monitor interval=5 role=Master timeout=20 \
> op monitor interval=10 role=Slave timeout=20 \
> meta resource-stickiness=100
> ms stateful_ms stateful_dummy \
> meta resource-stickiness=100 notify=true master-max=1 interleave=true 
> migration-threshold=4 failure-timeout=60 target-role=Started
> 
> 
> #crm resource score
> Allocation scores and utilization information:
> Original: NODE-0 capacity:
> Original: NODE-1 capacity:
> Original: NODE-2 capacity:
> Original: NODE-3 capacity:
> Original: NODE-4 capacity:
> clone_color: stateful_ms allocation score on NODE-0: 0
> clone_color: stateful_ms allocation score on NODE-1: 0
> clone_color: stateful_ms allocation score on NODE-2: 0
> clone_color: stateful_ms allocation score on NODE-3: 0
> clone_color: stateful_ms allocation score on NODE-4: 0
> clone_color: stateful_dummy:0 allocation score on NODE-0: 110
> clone_color: stateful_dummy:0 allocation score on NODE-1: 0
> clone_color: stateful_dummy:0 allocation score on NODE-2: 0
> clone_color: stateful_dummy:0 allocation score on NODE-3: 0
> clone_color: stateful_dummy:0 allocation score on NODE-4: 0
> clone_color: stateful_dummy:1 allocation score on NODE-0: 0
> clone_color: stateful_dummy:1 allocation score on NODE-1: 105
> clone_color: stateful_dummy:1 allocation score on NODE-2: 0
> clone_color: stateful_dummy:1 allocation score on NODE-3: 0
> clone_color: stateful_dummy:1 allocation score on NODE-4: 0
> clone_color: stateful_dummy:2 allocation score on NODE-0: 0
> clone_color: stateful_dummy:2 allocation score on NODE-1: 0
> clone_color: stateful_dummy:2 allocation score on NODE-2: 105
> clone_color: stateful_dummy:2 allocation score on NODE-3: 0
> clone_color: stateful_dummy:2 allocation score on NODE-4: 0
> clone_color: stateful_dummy:3 allocation score on NODE-0: 0
> clone_color: stateful_dummy:3 allocation score on NODE-1: 0
> clone_color: stateful_dummy:3 allocation score on NODE-2: 0
> clone_color: stateful_dummy:3 allocation score on NODE-3: 105
> clone_color: stateful_dummy:3 allocation score on NODE-4: 0
> clone_color: stateful_dummy:4 allocation score on NODE-0: 0
> clone_color: stateful_dummy:4 allocation score on NODE-1: 0
> clone_color: stateful_dummy:4 allocation score on NODE-2: 0
> clone_color: stateful_dummy:4 allocation score on NODE-3: 0
> clone_color: stateful_dummy:4 allocation score on NODE-4: 105
> native_color: stateful_dummy:2 allocation score on NODE-0: 0
> native_color: stateful_dummy:2 allocation score on NODE-1: 0
> native_color: stateful_dummy:2 allocation score on NODE-2: 105
> native_color: stateful_dummy:2 allocation score on

[ClusterLabs] cluster with two ESX server

2017-11-27 Thread Ramann, Björn
hi@all,

in my configuration, the 1st node runs on ESX1 and the second runs on ESX2.
Now I'm looking for a way to configure cluster fencing/stonith with two ESX
servers - is this possible?

I tried to use fence_vmware with vCenter, but then vCenter is a single point
of failure, and running two vCenters is currently not possible.

Any ideas?

Thanks!

