Fencing is hardware dependent. Typically you'd use fence_ipmilan if your
nodes have IPMI, and/or switched PDUs like the APC AP7900 (which would
use the fence_apc_snmp agent).
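As a rough sketch only (the device names, IPMI addresses and credentials
below are made-up placeholders, not something from your cluster), an
IPMI-based setup in crm shell syntax could look like this:

    primitive fence-host1 stonith:fence_ipmilan \
            params pcmk_host_list=host1.com ipaddr=192.0.2.11 \
                    login=fenceuser passwd=fencepass lanplus=1 \
            op monitor interval=60s
    primitive fence-host2 stonith:fence_ipmilan \
            params pcmk_host_list=host2.com ipaddr=192.0.2.12 \
                    login=fenceuser passwd=fencepass lanplus=1 \
            op monitor interval=60s
    location fence-host1-avoid-host1 fence-host1 -inf: host1.com
    location fence-host2-avoid-host2 fence-host2 -inf: host2.com

The location constraints simply keep each fence device from running on
the node it is meant to fence.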
If you aren't sure, tell us what hardware you have.

On 02/07/15 04:04 AM, alex austin wrote:
> Thank you!
>
> However, what is proper fencing in this situation?
>
> Kind Regards,
>
> Alex
>
> On Wed, Jul 1, 2015 at 11:30 PM, Ken Gaillot <kgail...@redhat.com> wrote:
>
> On 07/01/2015 09:39 AM, alex austin wrote:
> > This is what crm_mon shows
> >
> > Last updated: Wed Jul 1 10:35:40 2015
> > Last change: Wed Jul 1 09:52:46 2015
> > Stack: classic openais (with plugin)
> > Current DC: host2 - partition with quorum
> > Version: 1.1.11-97629de
> > 2 Nodes configured, 2 expected votes
> > 4 Resources configured
> >
> > Online: [ host1 host2 ]
> >
> > ClusterIP (ocf::heartbeat:IPaddr2): Started host2
> > Master/Slave Set: redis_clone [redis]
> >     Masters: [ host2 ]
> >     Slaves: [ host1 ]
> > pcmk-fencing (stonith:fence_pcmk): Started host2
> >
> > On Wed, Jul 1, 2015 at 3:37 PM, alex austin <alexixa...@gmail.com> wrote:
> >
> >> I am running version 1.4.7 of corosync
>
> If you can't upgrade to corosync 2 (which has many improvements), you'll
> need to set the no-quorum-policy=ignore cluster option.
>
> Proper fencing is necessary to avoid a split-brain situation, which can
> corrupt your data.
>
> >> On Wed, Jul 1, 2015 at 3:25 PM, Ken Gaillot <kgail...@redhat.com> wrote:
> >>
> >>> On 07/01/2015 08:57 AM, alex austin wrote:
> >>>> I have now configured stonith-enabled=true. What device should I use
> >>>> for fencing given the fact that it's a virtual machine but I don't
> >>>> have access to its configuration. Would fence_pcmk do? If so, what
> >>>> parameters should I configure for it to work properly?
> >>>
> >>> No, fence_pcmk is not for use in pacemaker, but for use in RHEL6's
> >>> CMAN to redirect its fencing requests to pacemaker.
> >>>
> >>> For a virtual machine, ideally you'd use fence_virtd running on the
> >>> physical host, but I'm guessing from your comment that you can't do
> >>> that. Does whoever provides your VM also provide an API for
> >>> controlling it (starting/stopping/rebooting)?
> >>>
> >>> Regarding your original problem, it sounds like the surviving node
> >>> doesn't have quorum. What version of corosync are you using? If
> >>> you're using corosync 2, you need "two_node: 1" in corosync.conf, in
> >>> addition to configuring fencing in pacemaker.
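For reference, the two options Ken mentions would look roughly like the
sketch below; adjust it to your own corosync.conf layout. On corosync 2
the two-node flag goes in the quorum section:

    quorum {
        provider: corosync_votequorum
        two_node: 1
    }

On corosync 1.x with the pacemaker plugin, the cluster option can be set
from the crm shell:

    crm configure property no-quorum-policy=ignore

Keep in mind that ignoring quorum is only safe once fencing is working.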
> >>>
> >>>> This is my new config:
> >>>>
> >>>> node dcwbpvmuas004.edc.nam.gm.com \
> >>>>         attributes standby=off
> >>>> node dcwbpvmuas005.edc.nam.gm.com \
> >>>>         attributes standby=off
> >>>> primitive ClusterIP IPaddr2 \
> >>>>         params ip=198.208.86.242 cidr_netmask=23 \
> >>>>         op monitor interval=1s timeout=20s \
> >>>>         op start interval=0 timeout=20s \
> >>>>         op stop interval=0 timeout=20s \
> >>>>         meta is-managed=true target-role=Started resource-stickiness=500
> >>>> primitive pcmk-fencing stonith:fence_pcmk \
> >>>>         params pcmk_host_list="dcwbpvmuas004.edc.nam.gm.com
> >>>> dcwbpvmuas005.edc.nam.gm.com" \
> >>>>         op monitor interval=10s \
> >>>>         meta target-role=Started
> >>>> primitive redis redis \
> >>>>         meta target-role=Master is-managed=true \
> >>>>         op monitor interval=1s role=Master timeout=5s on-fail=restart
> >>>> ms redis_clone redis \
> >>>>         meta notify=true is-managed=true ordered=false interleave=false
> >>>> globally-unique=false target-role=Master migration-threshold=1
> >>>> colocation ClusterIP-on-redis inf: ClusterIP redis_clone:Master
> >>>> colocation ip-on-redis inf: ClusterIP redis_clone:Master
> >>>> colocation pcmk-fencing-on-redis inf: pcmk-fencing redis_clone:Master
> >>>> property cib-bootstrap-options: \
> >>>>         dc-version=1.1.11-97629de \
> >>>>         cluster-infrastructure="classic openais (with plugin)" \
> >>>>         expected-quorum-votes=2 \
> >>>>         stonith-enabled=true
> >>>> property redis_replication: \
> >>>>         redis_REPL_INFO=dcwbpvmuas005.edc.nam.gm.com
> >>>>
> >>>> On Wed, Jul 1, 2015 at 2:53 PM, Nekrasov, Alexander
> >>>> <alexander.nekra...@emc.com> wrote:
> >>>>
> >>>>> stonith-enabled=false
> >>>>>
> >>>>> this might be the issue. The way peer node death is resolved, the
> >>>>> surviving node must call STONITH on the peer. If it's disabled it
> >>>>> might not be able to resolve the event
> >>>>>
> >>>>> Alex
> >>>>>
> >>>>> *From:* alex austin [mailto:alexixa...@gmail.com]
> >>>>> *Sent:* Wednesday, July 01, 2015 9:51 AM
> >>>>> *To:* Users@clusterlabs.org
> >>>>> *Subject:* Re: [ClusterLabs] Pacemaker failover failure
> >>>>>
> >>>>> So I noticed that if I kill redis on one node, it starts on the
> >>>>> other, no problem, but if I actually kill pacemaker itself on one
> >>>>> node, the other doesn't "sense" it so it doesn't fail over.
> >>>>>
> >>>>> On Wed, Jul 1, 2015 at 12:42 PM, alex austin <alexixa...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>> Hi all,
> >>>>>
> >>>>> I have configured a virtual ip and redis in master-slave with
> >>>>> corosync pacemaker. If redis fails, then the failover is successful,
> >>>>> and redis gets promoted on the other node. However if pacemaker
> >>>>> itself fails on the active node, the failover is not performed. Is
> >>>>> there anything I missed in the configuration?
> >>>>>
> >>>>> Here's my configuration (I have hashed the IP address out):
> >>>>>
> >>>>> node host1.com
> >>>>> node host2.com
> >>>>> primitive ClusterIP IPaddr2 \
> >>>>>         params ip=xxx.xxx.xxx.xxx cidr_netmask=23 \
> >>>>>         op monitor interval=1s timeout=20s \
> >>>>>         op start interval=0 timeout=20s \
> >>>>>         op stop interval=0 timeout=20s \
> >>>>>         meta is-managed=true target-role=Started resource-stickiness=500
> >>>>> primitive redis redis \
> >>>>>         meta target-role=Master is-managed=true \
> >>>>>         op monitor interval=1s role=Master timeout=5s on-fail=restart
> >>>>> ms redis_clone redis \
> >>>>>         meta notify=true is-managed=true ordered=false interleave=false
> >>>>> globally-unique=false target-role=Master migration-threshold=1
> >>>>> colocation ClusterIP-on-redis inf: ClusterIP redis_clone:Master
> >>>>> colocation ip-on-redis inf: ClusterIP redis_clone:Master
> >>>>> property cib-bootstrap-options: \
> >>>>>         dc-version=1.1.11-97629de \
> >>>>>         cluster-infrastructure="classic openais (with plugin)" \
> >>>>>         expected-quorum-votes=2 \
> >>>>>         stonith-enabled=false
> >>>>> property redis_replication: \
> >>>>>         redis_REPL_INFO=host.com

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org