On Thu, Jan 14, 2010 at 11:44 PM, Miki Shapiro <miki.shap...@coles.com.au> wrote:
> Confused.
>
> I *am* running DRBD in dual-master mode

/me cringes... this sounds to me like an impossibly dangerous idea.
Can someone from Linbit comment on this please? Am I imagining this?

> (apologies, I should have mentioned
> that earlier), and there will be both WAN clients as well as
> local-to-datacenter-clients writing to both nodes on both ends. It’s safe to
> assume the clients will know not of the split.
>
> In a WAN split I need to ensure that the node whose idea of drbd volume will
> be kept once resync happens stays up, and node that’ll get blown away and
> re-synced/overwritten becomes dead asap.

Won't you _always_ lose some data in a WAN split though?
AFAICS, all you're doing here is preventing "some" from becoming "lots".

Is master/master really a requirement?

> NodeX(Successfully) taking on data from clients while in
> quorumless-freeze-still-providing-service, then discarding its hitherto
> collected client data when realizing other node has quorum and discarding
> own data isn’t good.

Agreed - freeze isn't an option if you're doing master/master.

> To recap what I understood so far:
>
> 1. CRM Availability on the multicast channel drives DC election, but
> DC election is irrelevant to us here.
>
> 2. CRM Availability on the multicast channel (rather than resource
> failure) drive who-is-in-quorum-and-who-is-not decisions [not sure here..
> correct?

correct

> Or does resource failure drive quorum? ]

quorum applies to node availability - resource failures have no impact
(unless they lead to fencing, which then leads to the node leaving the
membership)

> 3. Steve to clarify what happens quorum-wise if 1/3 nodes sees both
> others, but the other two only see the first (“broken triangle”), and
> whether this behaviour may differ based on whether the first node (which is
> different as it sees both others) happens to be the DC at the time or not.

Try in a cluster of 3 VMs?
Just use iptables rules to simulate the broken links.

> Given that anyone who goes about building a production cluster would want to
> identify all likely failure modes and be able to predict how the cluster
> behaves in each one, is there any user-targeted doco/rtfm material one could
> read regarding how quorum establishment works in such scenarios?

I don't think corosync has such a doc at the moment.

> Setting up a 3-way with intermittent WAN links without getting a clear
> understanding in advance of how the software will behave is … scary :)

_______________________________________________
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
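
For the 3-VM "broken triangle" test suggested above, something like the
following might help generate the iptables rules to run on each VM. It is only
a rough sketch: the node names and addresses are made up, and dropping
per-peer INPUT/OUTPUT traffic is just one way to cut a link. Whether that fully
isolates the cluster traffic depends on how corosync is configured (multicast
vs. unicast, interfaces, ports), so check the result in the membership logs
rather than assuming it.

```python
# Hypothetical node names and addresses for a three-VM test cluster.
NODES = {
    "node1": "192.168.122.11",
    "node2": "192.168.122.12",
    "node3": "192.168.122.13",
}


def drop_rules(nodes, broken_links):
    """Return {node: [iptables commands]} cutting each listed link both ways."""
    rules = {name: [] for name in nodes}
    for a, b in broken_links:
        # Each end of the link drops traffic from and to the other end.
        for local, peer in ((a, b), (b, a)):
            ip = nodes[peer]
            rules[local].append(f"iptables -A INPUT -s {ip} -j DROP")
            rules[local].append(f"iptables -A OUTPUT -d {ip} -j DROP")
    return rules


def visible_peers(nodes, broken_links):
    """Return {node: [peers it can still reach]} for the simulated topology."""
    cut = {frozenset(link) for link in broken_links}
    return {
        n: sorted(p for p in nodes if p != n and frozenset((n, p)) not in cut)
        for n in nodes
    }


if __name__ == "__main__":
    # Broken triangle: node1 sees both others, node2 and node3 only see node1.
    broken = [("node2", "node3")]
    for node, cmds in drop_rules(NODES, broken).items():
        print(f"# run on {node}")
        for cmd in cmds:
            print(cmd)
    print(visible_peers(NODES, broken))
```

This only sets up the topology; it makes no claim about which nodes corosync
will consider quorate afterwards. That is what the test is meant to show.
Replacing -A with -D in the same commands removes the rules again.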