On 07/01/2015 09:39 AM, alex austin wrote:
> This is what crm_mon shows:
>
> Last updated: Wed Jul 1 10:35:40 2015
> Last change: Wed Jul 1 09:52:46 2015
> Stack: classic openais (with plugin)
> Current DC: host2 - partition with quorum
> Version: 1.1.11-97629de
> 2 Nodes configured, 2 expected votes
> 4 Resources configured
>
> Online: [ host1 host2 ]
>
> ClusterIP (ocf::heartbeat:IPaddr2): Started host2
> Master/Slave Set: redis_clone [redis]
>     Masters: [ host2 ]
>     Slaves: [ host1 ]
> pcmk-fencing (stonith:fence_pcmk): Started host2
>
> On Wed, Jul 1, 2015 at 3:37 PM, alex austin <alexixa...@gmail.com> wrote:
>
>> I am running version 1.4.7 of corosync

If you can't upgrade to corosync 2 (which has many improvements), you'll
need to set the no-quorum-policy=ignore cluster option. Proper fencing is
necessary to avoid a split-brain situation, which can corrupt your data.
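For example, with the crm shell (an illustrative sketch, assuming you manage
the cluster with crm as your configuration suggests; verify the equivalent
command in whatever tooling you actually use):

    # A two-node cluster on the corosync 1.x plugin stack loses quorum as soon
    # as one node goes away, so tell Pacemaker to keep running resources anyway.
    crm configure property no-quorum-policy=ignore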
>> On Wed, Jul 1, 2015 at 3:25 PM, Ken Gaillot <kgail...@redhat.com> wrote:
>>
>>> On 07/01/2015 08:57 AM, alex austin wrote:
>>>> I have now configured stonith-enabled=true. What device should I use
>>>> for fencing, given the fact that it's a virtual machine but I don't
>>>> have access to its configuration? Would fence_pcmk do? If so, what
>>>> parameters should I configure for it to work properly?
>>>
>>> No, fence_pcmk is not for use in pacemaker, but for use in RHEL 6's
>>> CMAN to redirect its fencing requests to pacemaker.
>>>
>>> For a virtual machine, ideally you'd use fence_virtd running on the
>>> physical host, but I'm guessing from your comment that you can't do
>>> that. Does whoever provides your VM also provide an API for controlling
>>> it (starting/stopping/rebooting)?
>>>
>>> Regarding your original problem, it sounds like the surviving node
>>> doesn't have quorum. What version of corosync are you using? If you're
>>> using corosync 2, you need "two_node: 1" in corosync.conf, in addition
>>> to configuring fencing in pacemaker.
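For reference, on corosync 2 that two_node setting goes in the quorum section
of corosync.conf. A minimal sketch (the rest of the file and the node list are
up to your environment):

    quorum {
        provider: corosync_votequorum
        two_node: 1    # also enables wait_for_all by default
    }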
>>>> This is my new config:
>>>>
>>>> node dcwbpvmuas004.edc.nam.gm.com \
>>>>         attributes standby=off
>>>> node dcwbpvmuas005.edc.nam.gm.com \
>>>>         attributes standby=off
>>>> primitive ClusterIP IPaddr2 \
>>>>         params ip=198.208.86.242 cidr_netmask=23 \
>>>>         op monitor interval=1s timeout=20s \
>>>>         op start interval=0 timeout=20s \
>>>>         op stop interval=0 timeout=20s \
>>>>         meta is-managed=true target-role=Started resource-stickiness=500
>>>> primitive pcmk-fencing stonith:fence_pcmk \
>>>>         params pcmk_host_list="dcwbpvmuas004.edc.nam.gm.com dcwbpvmuas005.edc.nam.gm.com" \
>>>>         op monitor interval=10s \
>>>>         meta target-role=Started
>>>> primitive redis redis \
>>>>         meta target-role=Master is-managed=true \
>>>>         op monitor interval=1s role=Master timeout=5s on-fail=restart
>>>> ms redis_clone redis \
>>>>         meta notify=true is-managed=true ordered=false interleave=false globally-unique=false target-role=Master migration-threshold=1
>>>> colocation ClusterIP-on-redis inf: ClusterIP redis_clone:Master
>>>> colocation ip-on-redis inf: ClusterIP redis_clone:Master
>>>> colocation pcmk-fencing-on-redis inf: pcmk-fencing redis_clone:Master
>>>> property cib-bootstrap-options: \
>>>>         dc-version=1.1.11-97629de \
>>>>         cluster-infrastructure="classic openais (with plugin)" \
>>>>         expected-quorum-votes=2 \
>>>>         stonith-enabled=true
>>>> property redis_replication: \
>>>>         redis_REPL_INFO=dcwbpvmuas005.edc.nam.gm.com
>>>>
>>>> On Wed, Jul 1, 2015 at 2:53 PM, Nekrasov, Alexander <alexander.nekra...@emc.com> wrote:
>>>>
>>>>> stonith-enabled=false
>>>>>
>>>>> this might be the issue. The way peer node death is resolved, the
>>>>> surviving node must call STONITH on the peer. If it’s disabled it
>>>>> might not be able to resolve the event.
>>>>>
>>>>> Alex
>>>>>
>>>>> *From:* alex austin [mailto:alexixa...@gmail.com]
>>>>> *Sent:* Wednesday, July 01, 2015 9:51 AM
>>>>> *To:* Users@clusterlabs.org
>>>>> *Subject:* Re: [ClusterLabs] Pacemaker failover failure
>>>>>
>>>>> So I noticed that if I kill redis on one node, it starts on the other,
>>>>> no problem, but if I actually kill pacemaker itself on one node, the
>>>>> other doesn't "sense" it, so it doesn't fail over.
>>>>>
>>>>> On Wed, Jul 1, 2015 at 12:42 PM, alex austin <alexixa...@gmail.com> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I have configured a virtual IP and redis in master-slave with corosync
>>>>> and pacemaker. If redis fails, the failover is successful and redis
>>>>> gets promoted on the other node. However, if pacemaker itself fails on
>>>>> the active node, the failover is not performed. Is there anything I
>>>>> missed in the configuration?
>>>>>
>>>>> Here's my configuration (I have hashed the IP address out):
>>>>>
>>>>> node host1.com
>>>>> node host2.com
>>>>> primitive ClusterIP IPaddr2 \
>>>>>         params ip=xxx.xxx.xxx.xxx cidr_netmask=23 \
>>>>>         op monitor interval=1s timeout=20s \
>>>>>         op start interval=0 timeout=20s \
>>>>>         op stop interval=0 timeout=20s \
>>>>>         meta is-managed=true target-role=Started resource-stickiness=500
>>>>> primitive redis redis \
>>>>>         meta target-role=Master is-managed=true \
>>>>>         op monitor interval=1s role=Master timeout=5s on-fail=restart
>>>>> ms redis_clone redis \
>>>>>         meta notify=true is-managed=true ordered=false interleave=false globally-unique=false target-role=Master migration-threshold=1
>>>>> colocation ClusterIP-on-redis inf: ClusterIP redis_clone:Master
>>>>> colocation ip-on-redis inf: ClusterIP redis_clone:Master
>>>>> property cib-bootstrap-options: \
>>>>>         dc-version=1.1.11-97629de \
>>>>>         cluster-infrastructure="classic openais (with plugin)" \
>>>>>         expected-quorum-votes=2 \
>>>>>         stonith-enabled=false
>>>>> property redis_replication: \
>>>>>         redis_REPL_INFO=host.com

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org