Re: [ClusterLabs] Cluster node getting stopped from other node(resending mail)

2015-07-01 Thread Ken Gaillot
On 06/30/2015 11:30 PM, Arjun Pandey wrote:
 Hi
 
 I am running a 2-node cluster with this config on CentOS 6.5/6.6:
 
 Master/Slave Set: foo-master [foo]
 Masters: [ messi ]
 Stopped: [ ronaldo ]
  eth1-CP    (ocf::pw:IPaddr):   Started messi
  eth2-UP    (ocf::pw:IPaddr):   Started messi
  eth3-UPCP  (ocf::pw:IPaddr):   Started messi
 
 where foo is a multi-state resource run in master/slave mode, and the
 IPaddr RA is just a modified IPaddr2 RA. Additionally, I have a
 colocation constraint for the IP addresses to be colocated with the master.
 
 Sometimes when I set up the cluster, I find that one of the nodes (the
 second node that joins) gets stopped, and I see this log:
 
 2015-06-01T13:55:46.153941+05:30 ronaldo pacemaker: Starting Pacemaker
 Cluster Manager
 2015-06-01T13:55:46.233639+05:30 ronaldo attrd[25988]:   notice:
 attrd_trigger_update: Sending flush op to all hosts for: shutdown (0)
 2015-06-01T13:55:46.234162+05:30 ronaldo crmd[25990]:   notice:
 do_state_transition: State transition S_PENDING -> S_NOT_DC [
 input=I_NOT_DC cause=C_HA_MESSAGE origin=do_cl_join_finalize_respond ]
 2015-06-01T13:55:46.234701+05:30 ronaldo attrd[25988]:   notice:
 attrd_local_callback: Sending full refresh (origin=crmd)
 2015-06-01T13:55:46.234708+05:30 ronaldo attrd[25988]:   notice:
 attrd_trigger_update: Sending flush op to all hosts for: shutdown (0)
 *** This looks to be the likely reason ***
 2015-06-01T13:55:46.254310+05:30 ronaldo crmd[25990]:    error:
 handle_request: We didn't ask to be shut down, yet our DC is telling us too.

Hi Arjun,

I'd check the other node's logs at this time, to see why it requested
the shutdown.
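
For example (assuming the CentOS 6 default of the cluster daemons logging to
/var/log/messages; adjust the path and timestamp to your setup):

  # on the node that was DC at the time
  grep -E 'crmd|pengine' /var/log/messages | grep '13:55:4'

crm_report can also collect the logs from both nodes around that window.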

 2015-06-01T13:55:46.254577+05:30 ronaldo crmd[25990]:   notice:
 do_state_transition: State transition S_NOT_DC -> S_STOPPING [ input=I_STOP
 cause=C_HA_MESSAGE
  origin=route_message ]
 2015-06-01T13:55:46.255134+05:30 ronaldo crmd[25990]:   notice:
 lrm_state_verify_stopped: Stopped 0 recurring operations at shutdown...
 waiting (2 ops remaining)
 
 Based on the logs, pacemaker on the active node was stopping the secondary cloud
 every time it joined the cluster. This issue seems similar to
 http://pacemaker.oss.clusterlabs.narkive.com/rVvN8May/node-sends-shutdown-request-to-other-node-error
 
 Packages used :-
 pacemaker-1.1.12-4.el6.x86_64
 pacemaker-libs-1.1.12-4.el6.x86_64
 pacemaker-cli-1.1.12-4.el6.x86_64
 pacemaker-cluster-libs-1.1.12-4.el6.x86_64
 pacemaker-debuginfo-1.1.10-14.el6.x86_64
 pcsc-lite-libs-1.5.2-13.el6_4.x86_64
 pcs-0.9.90-2.el6.centos.2.noarch
 pcsc-lite-1.5.2-13.el6_4.x86_64
 pcsc-lite-openct-0.6.19-4.el6.x86_64
 corosync-1.4.1-17.el6.x86_64
 corosynclib-1.4.1-17.el6.x86_64


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Resource stop when another resource run on that node

2015-07-01 Thread Ken Gaillot
On 07/01/2015 01:18 AM, John Gogu wrote:
 Hello,
 this is what I have set up, but it is not working 100%:
 
 Online: [ node01hb0 node02hb0 ]
 Full list of resources:
  IP1_Vir    (ocf::heartbeat:IPaddr):    Started node01hb0
  IP2_Vir    (ocf::heartbeat:IPaddr):    Started node02hb0
 
 
  default-resource-stickiness: 2000
 
 
 Location Constraints:
   Resource: IP1_Vir
 Enabled on: node01hb0 (score:1000)
 
   Resource: IP2_Vir
 Disabled on: node01hb0 (score:-INFINITY)
 
 Colocation Constraints:
   IP2_Vir with IP1_Vir (score:-INFINITY)
 
 When I manually move the resource IP1_Vir from node01hb0 to node02hb0, all is
 fine: IP2_Vir is stopped.

That's what you asked it to do. :)

The -INFINITY constraint for IP2_Vir on node01hb0 means that IP2_Vir can
*never* run on that node. The -INFINITY constraint for IP2_Vir with
IP1_Vir means that IP2_Vir can *never* run on the same node as IP1_Vir.
So if IP1_Vir is on node02hb0, then IP2_Vir has nowhere to run.

If you want either node to be able to take over either IP when
necessary, you don't want any -INFINITY constraints. You can use a score
other than -INFINITY to give a preference instead of a requirement.

For example, if you want the IPs to run on different nodes whenever
possible, you could have a colocation constraint IP2_Vir with IP1_Vir
score -3000. Having the score more negative than the resource stickiness
means that when a failed node comes back up, one of the IPs will move to
it. If you don't want that, use a score less than your stickiness, such
as -100.

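For example, a minimal sketch with pcs (assuming that's what you're using;
the constraint IDs shown on your cluster will differ):

  # find and remove the existing -INFINITY colocation/location constraints
  pcs constraint show --full
  pcs constraint remove <constraint-id>
  # prefer, but don't require, keeping the two IPs apart
  pcs constraint colocation add IP2_Vir with IP1_Vir -3000

In crm shell syntax the colocation would be:
  colocation ip-apart -3000: IP2_Vir IP1_Vir
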
You probably don't want any location constraints, unless there's a
reason each IP should be on a specific node in normal operation.

 When I crash node node01hb0 or stop pacemaker, both resources are stopped.

This likely depends on your quorum and fencing configuration, and what
versions of software you're using.
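
If you're not sure, something along these lines (command names assume a
reasonably recent corosync/pacemaker/pcs install) will show the relevant pieces:

  corosync -v              # corosync version
  pacemakerd --version     # pacemaker version
  pcs property list        # look for no-quorum-policy and stonith-enabled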



Re: [ClusterLabs] Pacemaker failover failure

2015-07-01 Thread alex austin
So I did another test:

two nodes: node1 and node2

Case: node1 is the active node
node2 is passive

If I killall -9 pacemakerd corosync on node1, the services do not fail over
to node2, but if I then start corosync and pacemaker on node1, it fails over
to node2.

Where am I going wrong?

Alex

On Wed, Jul 1, 2015 at 12:42 PM, alex austin alexixa...@gmail.com wrote:

 Hi all,

 I have configured a virtual IP and Redis in master/slave mode with corosync
 and pacemaker. If Redis fails, the failover is successful and Redis gets
 promoted on the other node. However, if pacemaker itself fails on the active
 node, the failover is not performed. Is there anything I missed in the
 configuration?

 Here's my configuration (I have hashed the IP address out):

 node host1.com

 node host2.com

 primitive ClusterIP IPaddr2 \

 params ip=xxx.xxx.xxx.xxx cidr_netmask=23 \

 op monitor interval=1s timeout=20s \

 op start interval=0 timeout=20s \

 op stop interval=0 timeout=20s \

 meta is-managed=true target-role=Started resource-stickiness=500

 primitive redis redis \

 meta target-role=Master is-managed=true \

 op monitor interval=1s role=Master timeout=5s on-fail=restart

 ms redis_clone redis \

 meta notify=true is-managed=true ordered=false interleave=false
 globally-unique=false target-role=Master migration-threshold=1

 colocation ClusterIP-on-redis inf: ClusterIP redis_clone:Master

 colocation ip-on-redis inf: ClusterIP redis_clone:Master

 property cib-bootstrap-options: \

 dc-version=1.1.11-97629de \

 cluster-infrastructure=classic openais (with plugin) \

 expected-quorum-votes=2 \

 stonith-enabled=false

 property redis_replication: \

 redis_REPL_INFO=host.com


 thank you in advance


 Kind regards,


 Alex



Re: [ClusterLabs] Pacemaker failover failure

2015-07-01 Thread Ken Gaillot
On 07/01/2015 08:57 AM, alex austin wrote:
 I have now configured stonith-enabled=true. What device should I use for
 fencing, given that it's a virtual machine but I don't have access
 to its configuration? Would fence_pcmk do? If so, what parameters should I
 configure for it to work properly?

No, fence_pcmk is not for use in Pacemaker; it is for use in RHEL 6's
CMAN, to redirect its fencing requests to Pacemaker.

For a virtual machine, ideally you'd use fence_virtd running on the
physical host, but I'm guessing from your comment that you can't do
that. Does whoever provides your VM also provide an API for controlling
it (starting/stopping/rebooting)?
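
If fence_virtd does turn out to be available on the physical host, a
hypothetical primitive in the crm syntax you're already using might look
roughly like this (fence_xvm is the agent that talks to fence_virtd; the
vm1/vm2 names are only placeholders for the VM names as the host knows them):

 primitive vm-fencing stonith:fence_xvm \
         params pcmk_host_map="host1.com:vm1;host2.com:vm2" \
         op monitor interval=60s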

Regarding your original problem, it sounds like the surviving node
doesn't have quorum. What version of corosync are you using? If you're
using corosync 2, you need two_node: 1 in corosync.conf, in addition
to configuring fencing in pacemaker.
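
For reference, with corosync 2 that is a quorum section in
/etc/corosync/corosync.conf like:

  quorum {
      provider: corosync_votequorum
      two_node: 1
  }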

 This is my new config:
 
 
 node dcwbpvmuas004.edc.nam.gm.com \
 
 attributes standby=off
 
 node dcwbpvmuas005.edc.nam.gm.com \
 
 attributes standby=off
 
 primitive ClusterIP IPaddr2 \
 
 params ip=198.208.86.242 cidr_netmask=23 \
 
 op monitor interval=1s timeout=20s \
 
 op start interval=0 timeout=20s \
 
 op stop interval=0 timeout=20s \
 
 meta is-managed=true target-role=Started resource-stickiness=500
 
 primitive pcmk-fencing stonith:fence_pcmk \
 
 params pcmk_host_list=dcwbpvmuas004.edc.nam.gm.com
 dcwbpvmuas005.edc.nam.gm.com \
 
 op monitor interval=10s \
 
 meta target-role=Started
 
 primitive redis redis \
 
 meta target-role=Master is-managed=true \
 
 op monitor interval=1s role=Master timeout=5s on-fail=restart
 
 ms redis_clone redis \
 
 meta notify=true is-managed=true ordered=false interleave=false
 globally-unique=false target-role=Master migration-threshold=1
 
 colocation ClusterIP-on-redis inf: ClusterIP redis_clone:Master
 
 colocation ip-on-redis inf: ClusterIP redis_clone:Master
 
 colocation pcmk-fencing-on-redis inf: pcmk-fencing redis_clone:Master
 
 property cib-bootstrap-options: \
 
 dc-version=1.1.11-97629de \
 
 cluster-infrastructure=classic openais (with plugin) \
 
 expected-quorum-votes=2 \
 
 stonith-enabled=true
 
 property redis_replication: \
 
 redis_REPL_INFO=dcwbpvmuas005.edc.nam.gm.com
 
 On Wed, Jul 1, 2015 at 2:53 PM, Nekrasov, Alexander 
 alexander.nekra...@emc.com wrote:
 
 stonith-enabled=false

 This might be the issue. The way peer node death is resolved is that the
 surviving node must call STONITH on the peer. If it's disabled, it might not
 be able to resolve the event.



 Alex



 *From:* alex austin [mailto:alexixa...@gmail.com]
 *Sent:* Wednesday, July 01, 2015 9:51 AM
 *To:* Users@clusterlabs.org
 *Subject:* Re: [ClusterLabs] Pacemaker failover failure



 So I noticed that if I kill redis on one node, it starts on the other, no
 problem, but if I actually kill pacemaker itself on one node, the other
 doesn't sense it so it doesn't fail over.







 On Wed, Jul 1, 2015 at 12:42 PM, alex austin alexixa...@gmail.com wrote:

 Hi all,



 I have configured a virtual ip and redis in master-slave with corosync
 pacemaker. If redis fails, then the failover is successful, and redis gets
 promoted on the other node. However if pacemaker itself fails on the active
 node, the failover is not performed. Is there anything I missed in the
 configuration?



 Here's my configuration (i have hashed the ip address out):



 node host1.com

 node host2.com

 primitive ClusterIP IPaddr2 \

 params ip=xxx.xxx.xxx.xxx cidr_netmask=23 \

 op monitor interval=1s timeout=20s \

 op start interval=0 timeout=20s \

 op stop interval=0 timeout=20s \

 meta is-managed=true target-role=Started resource-stickiness=500

 primitive redis redis \

 meta target-role=Master is-managed=true \

 op monitor interval=1s role=Master timeout=5s on-fail=restart

 ms redis_clone redis \

 meta notify=true is-managed=true ordered=false interleave=false
 globally-unique=false target-role=Master migration-threshold=1

 colocation ClusterIP-on-redis inf: ClusterIP redis_clone:Master

 colocation ip-on-redis inf: ClusterIP redis_clone:Master

 property cib-bootstrap-options: \

 dc-version=1.1.11-97629de \

 cluster-infrastructure=classic openais (with plugin) \

 expected-quorum-votes=2 \

 stonith-enabled=false

 property redis_replication: \

 redis_REPL_INFO=host.com





Re: [ClusterLabs] Pacemaker failover failure

2015-07-01 Thread alex austin
So I noticed that if I kill redis on one node, it starts on the other, no
problem, but if I actually kill pacemaker itself on one node, the other
doesn't sense it so it doesn't fail over.



On Wed, Jul 1, 2015 at 12:42 PM, alex austin alexixa...@gmail.com wrote:

 Hi all,

 I have configured a virtual IP and Redis in master/slave mode with corosync
 and pacemaker. If Redis fails, the failover is successful and Redis gets
 promoted on the other node. However, if pacemaker itself fails on the active
 node, the failover is not performed. Is there anything I missed in the
 configuration?

 Here's my configuration (I have hashed the IP address out):

 node host1.com

 node host2.com

 primitive ClusterIP IPaddr2 \

 params ip=xxx.xxx.xxx.xxx cidr_netmask=23 \

 op monitor interval=1s timeout=20s \

 op start interval=0 timeout=20s \

 op stop interval=0 timeout=20s \

 meta is-managed=true target-role=Started resource-stickiness=500

 primitive redis redis \

 meta target-role=Master is-managed=true \

 op monitor interval=1s role=Master timeout=5s on-fail=restart

 ms redis_clone redis \

 meta notify=true is-managed=true ordered=false interleave=false
 globally-unique=false target-role=Master migration-threshold=1

 colocation ClusterIP-on-redis inf: ClusterIP redis_clone:Master

 colocation ip-on-redis inf: ClusterIP redis_clone:Master

 property cib-bootstrap-options: \

 dc-version=1.1.11-97629de \

 cluster-infrastructure=classic openais (with plugin) \

 expected-quorum-votes=2 \

 stonith-enabled=false

 property redis_replication: \

 redis_REPL_INFO=host.com


 thank you in advance


 Kind regards,


 Alex



Re: [ClusterLabs] Pacemaker failover failure

2015-07-01 Thread alex austin
I am running version 1.4.7 of corosync



On Wed, Jul 1, 2015 at 3:25 PM, Ken Gaillot kgail...@redhat.com wrote:

 On 07/01/2015 08:57 AM, alex austin wrote:
  I have now configured stonith-enabled=true. What device should I use for
  fencing given the fact that it's a virtual machine but I don't have
 access
  to its configuration. would fence_pcmk do? if so, what parameters should
 I
  configure for it to work properly?

 No, fence_pcmk is not for using in pacemaker, but for using in RHEL6's
 CMAN to redirect its fencing requests to pacemaker.

 For a virtual machine, ideally you'd use fence_virtd running on the
 physical host, but I'm guessing from your comment that you can't do
 that. Does whoever provides your VM also provide an API for controlling
 it (starting/stopping/rebooting)?

 Regarding your original problem, it sounds like the surviving node
 doesn't have quorum. What version of corosync are you using? If you're
 using corosync 2, you need two_node: 1 in corosync.conf, in addition
 to configuring fencing in pacemaker.

  This is my new config:
 
 
  node dcwbpvmuas004.edc.nam.gm.com \
 
  attributes standby=off
 
  node dcwbpvmuas005.edc.nam.gm.com \
 
  attributes standby=off
 
  primitive ClusterIP IPaddr2 \
 
  params ip=198.208.86.242 cidr_netmask=23 \
 
  op monitor interval=1s timeout=20s \
 
  op start interval=0 timeout=20s \
 
  op stop interval=0 timeout=20s \
 
  meta is-managed=true target-role=Started resource-stickiness=500
 
  primitive pcmk-fencing stonith:fence_pcmk \
 
  params pcmk_host_list=dcwbpvmuas004.edc.nam.gm.com
  dcwbpvmuas005.edc.nam.gm.com \
 
  op monitor interval=10s \
 
  meta target-role=Started
 
  primitive redis redis \
 
  meta target-role=Master is-managed=true \
 
  op monitor interval=1s role=Master timeout=5s on-fail=restart
 
  ms redis_clone redis \
 
  meta notify=true is-managed=true ordered=false interleave=false
  globally-unique=false target-role=Master migration-threshold=1
 
  colocation ClusterIP-on-redis inf: ClusterIP redis_clone:Master
 
  colocation ip-on-redis inf: ClusterIP redis_clone:Master
 
  colocation pcmk-fencing-on-redis inf: pcmk-fencing redis_clone:Master
 
  property cib-bootstrap-options: \
 
  dc-version=1.1.11-97629de \
 
  cluster-infrastructure=classic openais (with plugin) \
 
  expected-quorum-votes=2 \
 
  stonith-enabled=true
 
  property redis_replication: \
 
  redis_REPL_INFO=dcwbpvmuas005.edc.nam.gm.com
 
  On Wed, Jul 1, 2015 at 2:53 PM, Nekrasov, Alexander 
  alexander.nekra...@emc.com wrote:
 
  stonith-enabled=false
 
  this might be the issue. The way peer node death is resolved, the
  surviving node must call STONITH on the peer. If it’s disabled it might
 not
  be able to resolve the event
 
 
 
  Alex
 
 
 
  *From:* alex austin [mailto:alexixa...@gmail.com]
  *Sent:* Wednesday, July 01, 2015 9:51 AM
  *To:* Users@clusterlabs.org
  *Subject:* Re: [ClusterLabs] Pacemaker failover failure
 
 
 
  So I noticed that if I kill redis on one node, it starts on the other,
 no
  problem, but if I actually kill pacemaker itself on one node, the other
  doesn't sense it so it doesn't fail over.
 
 
 
 
 
 
 
  On Wed, Jul 1, 2015 at 12:42 PM, alex austin alexixa...@gmail.com
 wrote:
 
  Hi all,
 
 
 
  I have configured a virtual ip and redis in master-slave with corosync
  pacemaker. If redis fails, then the failover is successful, and redis
 gets
  promoted on the other node. However if pacemaker itself fails on the
 active
  node, the failover is not performed. Is there anything I missed in the
  configuration?
 
 
 
  Here's my configuration (i have hashed the ip address out):
 
 
 
  node host1.com
 
  node host2.com
 
  primitive ClusterIP IPaddr2 \
 
  params ip=xxx.xxx.xxx.xxx cidr_netmask=23 \
 
  op monitor interval=1s timeout=20s \
 
  op start interval=0 timeout=20s \
 
  op stop interval=0 timeout=20s \
 
  meta is-managed=true target-role=Started resource-stickiness=500
 
  primitive redis redis \
 
  meta target-role=Master is-managed=true \
 
  op monitor interval=1s role=Master timeout=5s on-fail=restart
 
  ms redis_clone redis \
 
  meta notify=true is-managed=true ordered=false interleave=false
  globally-unique=false target-role=Master migration-threshold=1
 
  colocation ClusterIP-on-redis inf: ClusterIP redis_clone:Master
 
  colocation ip-on-redis inf: ClusterIP redis_clone:Master
 
  property cib-bootstrap-options: \
 
  dc-version=1.1.11-97629de \
 
  cluster-infrastructure=classic openais (with plugin) \
 
  expected-quorum-votes=2 \
 
  stonith-enabled=false
 
  property redis_replication: \
 
  redis_REPL_INFO=host.com




Re: [ClusterLabs] Pacemaker failover failure

2015-07-01 Thread Ken Gaillot
On 07/01/2015 09:39 AM, alex austin wrote:
 This is what crm_mon shows
 
 
 Last updated: Wed Jul  1 10:35:40 2015
 
 Last change: Wed Jul  1 09:52:46 2015
 
 Stack: classic openais (with plugin)
 
 Current DC: host2 - partition with quorum
 
 Version: 1.1.11-97629de
 
 2 Nodes configured, 2 expected votes
 
 4 Resources configured
 
 
 
 Online: [ host1 host2 ]
 
 
 ClusterIP (ocf::heartbeat:IPaddr2): Started host2
 
  Master/Slave Set: redis_clone [redis]
 
  Masters: [ host2 ]
 
  Slaves: [ host1 ]
 
 pcmk-fencing(stonith:fence_pcmk):   Started host2
 
 On Wed, Jul 1, 2015 at 3:37 PM, alex austin alexixa...@gmail.com wrote:
 
 I am running version 1.4.7 of corosync

If you can't upgrade to corosync 2 (which has many improvements), you'll
need to set the no-quorum-policy=ignore cluster option.
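
With the crm-style config you posted, that is:

  crm configure property no-quorum-policy=ignore

(or the pcs equivalent: pcs property set no-quorum-policy=ignore)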

Proper fencing is necessary to avoid a split-brain situation, which can
corrupt your data.

 On Wed, Jul 1, 2015 at 3:25 PM, Ken Gaillot kgail...@redhat.com wrote:

 On 07/01/2015 08:57 AM, alex austin wrote:
 I have now configured stonith-enabled=true. What device should I use for
 fencing given the fact that it's a virtual machine but I don't have
 access
 to its configuration. would fence_pcmk do? if so, what parameters
 should I
 configure for it to work properly?

 No, fence_pcmk is not for using in pacemaker, but for using in RHEL6's
 CMAN to redirect its fencing requests to pacemaker.

 For a virtual machine, ideally you'd use fence_virtd running on the
 physical host, but I'm guessing from your comment that you can't do
 that. Does whoever provides your VM also provide an API for controlling
 it (starting/stopping/rebooting)?

 Regarding your original problem, it sounds like the surviving node
 doesn't have quorum. What version of corosync are you using? If you're
 using corosync 2, you need two_node: 1 in corosync.conf, in addition
 to configuring fencing in pacemaker.

 This is my new config:


 node dcwbpvmuas004.edc.nam.gm.com \

 attributes standby=off

 node dcwbpvmuas005.edc.nam.gm.com \

 attributes standby=off

 primitive ClusterIP IPaddr2 \

 params ip=198.208.86.242 cidr_netmask=23 \

 op monitor interval=1s timeout=20s \

 op start interval=0 timeout=20s \

 op stop interval=0 timeout=20s \

 meta is-managed=true target-role=Started resource-stickiness=500

 primitive pcmk-fencing stonith:fence_pcmk \

 params pcmk_host_list=dcwbpvmuas004.edc.nam.gm.com
 dcwbpvmuas005.edc.nam.gm.com \

 op monitor interval=10s \

 meta target-role=Started

 primitive redis redis \

 meta target-role=Master is-managed=true \

 op monitor interval=1s role=Master timeout=5s on-fail=restart

 ms redis_clone redis \

 meta notify=true is-managed=true ordered=false interleave=false
 globally-unique=false target-role=Master migration-threshold=1

 colocation ClusterIP-on-redis inf: ClusterIP redis_clone:Master

 colocation ip-on-redis inf: ClusterIP redis_clone:Master

 colocation pcmk-fencing-on-redis inf: pcmk-fencing redis_clone:Master

 property cib-bootstrap-options: \

 dc-version=1.1.11-97629de \

 cluster-infrastructure=classic openais (with plugin) \

 expected-quorum-votes=2 \

 stonith-enabled=true

 property redis_replication: \

 redis_REPL_INFO=dcwbpvmuas005.edc.nam.gm.com

 On Wed, Jul 1, 2015 at 2:53 PM, Nekrasov, Alexander 
 alexander.nekra...@emc.com wrote:

 stonith-enabled=false

 this might be the issue. The way peer node death is resolved, the
 surviving node must call STONITH on the peer. If it’s disabled it
 might not
 be able to resolve the event



 Alex



 *From:* alex austin [mailto:alexixa...@gmail.com]
 *Sent:* Wednesday, July 01, 2015 9:51 AM
 *To:* Users@clusterlabs.org
 *Subject:* Re: [ClusterLabs] Pacemaker failover failure



 So I noticed that if I kill redis on one node, it starts on the other,
 no
 problem, but if I actually kill pacemaker itself on one node, the other
 doesn't sense it so it doesn't fail over.







 On Wed, Jul 1, 2015 at 12:42 PM, alex austin alexixa...@gmail.com
 wrote:

 Hi all,



 I have configured a virtual ip and redis in master-slave with corosync
 pacemaker. If redis fails, then the failover is successful, and redis
 gets
 promoted on the other node. However if pacemaker itself fails on the
 active
 node, the failover is not performed. Is there anything I missed in the
 configuration?



 Here's my configuration (i have hashed the ip address out):



 node host1.com

 node host2.com

 primitive ClusterIP IPaddr2 \

 params ip=xxx.xxx.xxx.xxx cidr_netmask=23 \

 op monitor interval=1s timeout=20s \

 op start interval=0 timeout=20s \

 op stop interval=0 timeout=20s \

 meta is-managed=true target-role=Started resource-stickiness=500

 primitive redis redis \

 meta target-role=Master is-managed=true \

 op monitor interval=1s role=Master timeout=5s on-fail=restart

 ms redis_clone redis \

 meta