On Sat, 25 Oct 2014 17:30:07 -0400 Digimer <li...@alteeve.ca> wrote:

> On 25/10/14 05:09 PM, Vladimir wrote:
> > Hi,
> >
> > currently I'm testing a 2 node setup using ubuntu trusty.
> >
> > # The scenario:
> >
> > All communication links between the 2 nodes are cut off. This
> > results in a split brain situation and both nodes take their
> > resources online.
> >
> > When the communication links get back, I see the following behaviour:
> >
> > On drbd level the split brain is detected and the device is
> > disconnected on both nodes because of "after-sb-2pri disconnect" and
> > then it goes to StandAlone ConnectionState.
> >
> > I'm wondering why pacemaker does not let the resources fail.
> > It is still possible to migrate resources between the nodes although
> > they're in StandAlone ConnectionState. After a split brain that's
> > not what I want.
> >
> > Is this the expected behaviour? Is it possible to let the resources
> > fail after the network recovery to avoid further data corruption?
> >
> > (At the moment I can't use resource or node level fencing in my
> > setup.)
> >
> > Here is the main part of my config:
> >
> > #> dpkg -l | awk '$2 ~ /^(pacem|coro|drbd|libqb)/{print $2,$3}'
> > corosync 2.3.3-1ubuntu1
> > drbd8-utils 2:8.4.4-1ubuntu1
> > libqb-dev 0.16.0.real-1ubuntu3
> > libqb0 0.16.0.real-1ubuntu3
> > pacemaker 1.1.10+git20130802-1ubuntu2.1
> > pacemaker-cli-utils 1.1.10+git20130802-1ubuntu2.1
> >
> > # pacemaker
> > primitive drbd-mysql ocf:linbit:drbd \
> >     params drbd_resource="mysql" \
> >     op monitor interval="29s" role="Master" \
> >     op monitor interval="30s" role="Slave"
> >
> > ms ms-drbd-mysql drbd-mysql \
> >     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
>
> Split-brains are prevented by using reliable fencing (aka stonith).
> You configure stonith in pacemaker (using IPMI/iRMC/iLO/etc, switched
> PDUs, etc). Then you configure DRBD to use the crm-fence-peer.sh
> fence-handler and you set the fencing policy to
> 'resource-and-stonith;'.
>
> This way, if all links fail, both nodes block and call a fence. The
> faster one fences (powers off) the slower, and then it begins
> recovery, assured that the peer is not doing the same.
>
> Without stonith/fencing, there is no defined behaviour. You will
> get split-brains and that is that. Consider: both nodes lose contact
> with their peer. Without fencing, both must assume the peer is dead
> and thus take over resources.
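
For reference, the DRBD side of the setup described above might look roughly like the fragment below in DRBD 8.4 syntax. This is only a sketch and has not been tested against this cluster. The resource name "mysql" and the "after-sb-2pri disconnect" policy come from the original post. The handler paths are the usual install locations and are an assumption here, as is the crm-unfence-peer.sh handler, which the thread does not mention.

```
resource mysql {
  disk {
    # Fencing policy named in the thread: freeze I/O and call the
    # fence-peer handler when the replication link is lost.
    fencing resource-and-stonith;
  }
  handlers {
    # Paths are assumed; check where the drbd8-utils package put them.
    fence-peer          "/usr/lib/drbd/crm-fence-peer.sh";
    # Assumed companion handler that removes the constraint after resync.
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
  net {
    # From the original post: disconnect when both sides were Primary.
    after-sb-2pri disconnect;
  }
}
```

With "fencing resource-only;" in place of "resource-and-stonith", the same handler gives resource-level fencing without stonith. That variant still needs a working cluster communication path at the moment the replication link drops, so it would probably not cover the case where all links are cut at once.
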
It's clear that split brains can occur in such a setup. But I would
expect pacemaker to stop the drbd resource when the link between the
cluster nodes is reestablished, instead of continuing to run it.

> This is why stonith is required in clusters. Even with quorum, you
> can't assume anything about the state of the peer until it is fenced,
> so it would only give you a false sense of security.

Maybe I can use resource-level fencing.

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org