> no-quorum-policy: ignore
> stonith-enabled: false

You must have fencing configured.
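For reference, a rough sketch of what that can look like with pcs on CentOS 6. Everything below is illustrative: the device names, IPs, and credentials are placeholders, and fence_ipmilan is just one example agent — use whichever fence device your hardware actually provides:

```shell
# One stonith device per node (IPMI as an example).
# Test each one, e.g. with 'stonith_admin -F node2' from node1.
pcs stonith create fence_node1 fence_ipmilan pcmk_host_list="node1" \
    ipaddr="10.0.0.1" login="admin" passwd="secret" op monitor interval=60s
pcs stonith create fence_node2 fence_ipmilan pcmk_host_list="node2" \
    ipaddr="10.0.0.2" login="admin" passwd="secret" op monitor interval=60s
pcs property set stonith-enabled=true

# DRBD side, in the resource file (e.g. /etc/drbd.d/<resource>.res):
#   disk {
#     fencing resource-and-stonith;
#   }
#   handlers {
#     fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
#     after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
#   }
```

With 'resource-and-stonith', DRBD freezes I/O when it loses the peer and calls the fence-peer handler, which places a constraint preventing the outdated node from being promoted until it resyncs.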
CentOS 6 uses pacemaker with the cman plugin. So set up cman (cluster.conf) to use the fence_pcmk passthrough agent, then set up proper stonith in pacemaker (and test that it works). Finally, tell DRBD to use 'fencing resource-and-stonith;' and configure the 'crm-{un,}fence-peer.sh' {un,}fence handlers. See if that gets things working.

On 07/09/16 04:04 AM, Devin Ortner wrote:
> I have a 2-node cluster running CentOS 6.8 and Pacemaker with DRBD. I have
> been using the "Clusters from Scratch" documentation to create my cluster,
> and I am running into a problem where DRBD is not failing over to the other
> node when one goes down. Here is my "pcs status" prior to when it is
> supposed to fail over:
>
> ------------------------------------------------------------------------
>
> [root@node1 ~]# pcs status
> Cluster name: webcluster
> Last updated: Tue Sep 6 14:50:21 2016    Last change: Tue Sep 6 14:50:17 2016 by root via crm_attribute on node1
> Stack: cman
> Current DC: node2 (version 1.1.14-8.el6_8.1-70404b0) - partition with quorum
> 2 nodes and 5 resources configured
>
> Online: [ node1 node2 ]
>
> Full list of resources:
>
>  Cluster_VIP   (ocf::heartbeat:IPaddr2):       Started node1
>  Master/Slave Set: ClusterDBclone [ClusterDB]
>      Masters: [ node1 ]
>      Slaves: [ node2 ]
>  ClusterFS     (ocf::heartbeat:Filesystem):    Started node1
>  WebSite       (ocf::heartbeat:apache):        Started node1
>
> Failed Actions:
> * ClusterFS_start_0 on node2 'unknown error' (1): call=61, status=complete, exitreason='none',
>     last-rc-change='Tue Sep 6 13:15:00 2016', queued=0ms, exec=40ms
>
> PCSD Status:
>   node1: Online
>   node2: Online
>
> [root@node1 ~]#
>
> When I put node1 in standby, everything fails over except DRBD:
>
> ------------------------------------------------------------------------
>
> [root@node1 ~]# pcs cluster standby node1
> [root@node1 ~]# pcs status
> Cluster name: webcluster
> Last updated: Tue Sep 6 14:53:45 2016    Last change: Tue Sep 6 14:53:37 2016 by root via cibadmin on node2
> Stack: cman
> Current DC: node2 (version 1.1.14-8.el6_8.1-70404b0) - partition with quorum
> 2 nodes and 5 resources configured
>
> Node node1: standby
> Online: [ node2 ]
>
> Full list of resources:
>
>  Cluster_VIP   (ocf::heartbeat:IPaddr2):       Started node2
>  Master/Slave Set: ClusterDBclone [ClusterDB]
>      Slaves: [ node2 ]
>      Stopped: [ node1 ]
>  ClusterFS     (ocf::heartbeat:Filesystem):    Stopped
>  WebSite       (ocf::heartbeat:apache):        Started node2
>
> Failed Actions:
> * ClusterFS_start_0 on node2 'unknown error' (1): call=61, status=complete, exitreason='none',
>     last-rc-change='Tue Sep 6 13:15:00 2016', queued=0ms, exec=40ms
>
> PCSD Status:
>   node1: Online
>   node2: Online
>
> [root@node1 ~]#
>
> I have pasted the contents of "/var/log/messages" here:
> http://pastebin.com/0i0FMzGZ
> Here is my configuration: http://pastebin.com/HqqBV90p
>
> When I unstandby node1, it comes back as the master for the DRBD, and
> everything else stays running on node2 (which is fine, because I haven't
> set up colocation constraints for that).
> Here is what I have after node1 is back:
>
> ------------------------------------------------------------------------
>
> [root@node1 ~]# pcs cluster unstandby node1
> [root@node1 ~]# pcs status
> Cluster name: webcluster
> Last updated: Tue Sep 6 14:57:46 2016    Last change: Tue Sep 6 14:57:42 2016 by root via cibadmin on node1
> Stack: cman
> Current DC: node2 (version 1.1.14-8.el6_8.1-70404b0) - partition with quorum
> 2 nodes and 5 resources configured
>
> Online: [ node1 node2 ]
>
> Full list of resources:
>
>  Cluster_VIP   (ocf::heartbeat:IPaddr2):       Started node2
>  Master/Slave Set: ClusterDBclone [ClusterDB]
>      Masters: [ node1 ]
>      Slaves: [ node2 ]
>  ClusterFS     (ocf::heartbeat:Filesystem):    Started node1
>  WebSite       (ocf::heartbeat:apache):        Started node2
>
> Failed Actions:
> * ClusterFS_start_0 on node2 'unknown error' (1): call=61, status=complete, exitreason='none',
>     last-rc-change='Tue Sep 6 13:15:00 2016', queued=0ms, exec=40ms
>
> PCSD Status:
>   node1: Online
>   node2: Online
>
> [root@node1 ~]#
>
> Any help would be appreciated; I think there is something dumb that I'm
> missing.
>
> Thank you.
>
> _______________________________________________
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without access to education?