On 16/04/17 01:53 PM, Eric Robinson wrote:
> I was reading in "Clusters from Scratch" where Beekhof states, "Some would
> argue that two-node clusters are always pointless, but that is an argument
> for another time." Is there a page or thread where this argument has been
> fleshed out? Most of my dozen clusters are 2 nodes. I hate to think they're
> pointless.
>
> --
> Eric Robinson

There is a belief that you can't build a reliable cluster without quorum. I am of the mind that you *can* build a very reliable 2-node cluster. In fact, every cluster our company has deployed, going back over five years, has been 2-node and has had exceptional uptime.

The confusion comes from the belief that quorum is required and stonith is optional. The reality is the opposite. I'll come back to this in a minute.

In a two-node cluster, you have two concerns:

1. If communication between the nodes fails, but both nodes are alive, how do you avoid a split-brain?

2. If you have a two-node cluster and enable cluster startup on boot, how do you avoid a fence loop?

Many answer #1 by saying "you need a quorum node to break the tie". In some cases, this works, but only when all nodes are behaving in a predictable manner.

Many answer #2 by saying "well, with three nodes, if a node boots and can't talk to either other node, it is inquorate and won't do anything". This is a valid mechanism, but it is not the only one.

So let me answer these from a 2-node perspective:

1. You use stonith, and the faster node lives while the slower node dies. From the moment of comms failure, the cluster blocks (which is needed with quorum, too) and doesn't restore operation until the (slower) peer is in a known state: off. You can bias this by setting a fence delay against your preferred node. Say node 1 is the node that normally hosts your services; you add 'delay="15"' to node 1's fence method. This tells node 2 to wait 15 seconds before fencing node 1. If both nodes are alive, node 1 fences node 2 immediately, so node 2 will be fenced before its timer expires.

2. In Corosync v2+, there is a 'wait_for_all' option that tells a node to not do anything until it is able to talk to the peer node. So in the case of a fence after a comms break, the node that reboots will come up, fail to reach the survivor node and do nothing more. Perfect.

Now let me come back to quorum vs. stonith.

Said simply: Quorum is a tool for when everything is working.
Fencing is a tool for when things go wrong.

Let's assume that your cluster is working fine, then for whatever reason, node 1 hangs hard. At the time of the freeze, it was hosting a virtual IP and an NFS service. Node 2 declares node 1 lost after a period of time and decides it needs to take over.

In the 3-node scenario, without stonith, node 2 reforms a cluster with node 3 (the quorum node), decides that it is quorate, starts its NFS server and takes over the virtual IP. So far, so good... until node 1 comes out of its hang. At that moment, node 1 has no idea time has passed. It has no reason to think "Am I still quorate? Are my locks still valid?" It just finishes whatever it was in the middle of doing and bam, split-brain. At the least, you have two nodes claiming the same IP at the same time. At worst, you have uncoordinated writes to shared storage and you've corrupted your data.

In the 2-node scenario, with stonith, node 2 is always quorate, so after declaring node 1 lost, it moves to fence node 1. Once node 1 is fenced, *then* it starts NFS, takes over the virtual IP and restores services. In this case, no split-brain is possible because node 1 has rebooted and comes up with a fresh state (or it's on fire and never coming back anyway).

This is why quorum is optional and stonith/fencing is not.

Now, with this said, I won't say that 3+ node clusters are bad. They're fine if they suit your use-case, but even with 3+ nodes you still must use stonith.

My *personal* argument in favour of 2-node clusters over 3+ nodes is this:

A cluster is not beautiful when there is nothing left to add. It is beautiful when there is nothing left to take away.

In availability clustering, nothing should ever be more important than availability, and availability is a product of simplicity. So in my view, a 3-node cluster adds complexity that is avoidable, and so is sub-optimal.

I'm happy to answer any questions you have on my comments/point of view on this.

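As a footnote to the fence delay point in answer 1 above, the timing can be sketched as a toy model in Python. This is only an illustration of the arithmetic, not how any fence agent or the cluster itself is implemented. The function name 'fence_race' and the 5-second time for a fence action to complete are assumptions made up for the example; only the 15-second delay comes from the text above.

```python
def fence_race(delay_on, fence_time=5.0):
    """Toy model of a 2-node fence race after a comms break, both nodes alive.

    delay_on maps each node name to the delay (in seconds) configured on
    that node's own fence method, i.e. how long its peer must wait before
    fencing it. fence_time is an assumed, hypothetical time for a fence
    action to finish once it starts.

    Returns (survivor, fenced, seconds_until_fenced). If both nodes would
    be powered off at the same instant, the outcome is a coin toss in this
    model, so (None, None, seconds) is returned.
    """
    if len(delay_on) != 2:
        raise ValueError("this sketch models exactly two nodes")

    (a, delay_a), (b, delay_b) = delay_on.items()

    # A node is powered off once its peer has waited out the delay set
    # against it and the fence action itself has completed.
    off_at = {a: delay_a + fence_time, b: delay_b + fence_time}

    if off_at[a] == off_at[b]:
        return None, None, off_at[a]

    fenced = min(off_at, key=off_at.get)
    survivor = b if fenced == a else a
    return survivor, fenced, off_at[fenced]


if __name__ == "__main__":
    # delay="15" on node 1's fence method, no delay on node 2's.
    print(fence_race({"node1": 15, "node2": 0}))
```

With the delay set against node 1, node 2 is powered off long before its own 15-second wait runs out, so node 1 keeps the services. With equal delays the model has no winner, which is the case the delay exists to avoid.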
-- 
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould