On Tue, 23 Nov 2010 12:28:41 +0000, Colin Simpson wrote:
> Since the third node is ignorant about the status of DRBD, I don't
> really see what help it gives for it to decide on quorum.
I've just read through the "Best Practice with DRBD RHCS and GFS2" thread on the drbd-users list, and I'm still missing what seems to me to be a fundamental issue.

First: it seems like you no longer (since 8.3.8) need to have GFS startup wait for the DRBD sync operation. That's good, but is this because DRBD does the proper thing with I/O requests during a sync? That's what I think is the case, but then I don't understand why you'd have had an issue with 8.2. Or am I missing something?

But the real issue for me is quorum/consensus. I noted:

    startup {
        wfc-timeout       0;    # Wait forever for initial connection
        degr-wfc-timeout 60;    # Wait only 60 seconds if this node
                                # was a degraded cluster
    }

and

    net {
        allow-two-primaries;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }

but when I break the DRBD connection between two primary nodes, "disconnect" apparently means that both nodes continue as if they have UpToDate disks. That lets the data go out of sync. Isn't this a Bad Thing?

Clearly, if there were some third party (i.e. a quorum disk or a third node), this could be resolved. But these don't seem to be required in the DRBD world, so how is this situation resolved? DRBD supports fencing, so perhaps that is the answer? I'm reluctant to use the cluster's fencing because, as described in the thread you referenced, cluster suite starts after DRBD.

I'm thinking of trying a fencing policy of resource-and-stonith, where the handler tries to acquire a shared semaphore (i.e. connect to a port on a third server that accepts only a single connection at a time, or perhaps even just take a lock on a file mounted via NFS from a third server). If it raises the semaphore/gets the lock, it fences the DRBD peer. If it doesn't, it either waits forever or marks itself as outdated. (Rough sketches of what I have in mind are in the P.S. below.)

This may also solve the startup "wait forever" problem, in that the starting node in WaitForConnect which gets the shared lock first gets to come up while the other is blocked. I'm not yet sure how to implement this from DRBD's perspective, though: I haven't found a handler that's called if DRBD starts and cannot establish an initial connection.

That I've found no mention of this idea leaves me suspicious that it won't work or that it's overkill. Yet I cannot see why; it follows the same quorum model as the cluster software.

Thanks...

Andrew
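P.S. To make the fencing idea a little more concrete, this is where I believe the hook goes in drbd.conf (8.3 syntax; the resource name and handler path are just placeholders for illustration):

    resource r0 {
        disk {
            fencing resource-and-stonith;   # freeze I/O and call the fence-peer
                                            # handler when the peer is lost
        }
        handlers {
            fence-peer "/usr/local/sbin/drbd-lock-fence.sh";   # placeholder path
        }
    }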
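The handler itself might look roughly like the untested sketch below. It implements the NFS-lock variant: /mnt/quorum is assumed to be an NFS export from the third server, the peer's node name is hard-coded, and the exit codes are my reading of the fence-peer convention in drbd.conf(5) - please correct me if I have those wrong. I've used mkdir rather than flock for the "semaphore", since mkdir should be atomic on the NFS server even where flock semantics between clients are dubious:

    #!/bin/sh
    # fence-peer handler sketch: grab a shared lock held on a third server;
    # only the node that wins the lock is allowed to fence its DRBD peer.

    LOCKDIR=/mnt/quorum/drbd-fence-lock   # directory on an NFS mount from the third server
    PEER=othernode                        # placeholder: the peer's cluster node name

    # DRBD exports DRBD_RESOURCE to its handlers.
    logger "drbd fence-peer handler invoked for ${DRBD_RESOURCE}"

    if mkdir "$LOCKDIR" 2>/dev/null; then
        # We hold the semaphore; fence the peer via the cluster's agent.
        # (fence_node only works once cman is up, so this covers the running
        # split-brain case, not the startup case I mentioned above.)
        if fence_node "$PEER"; then
            rmdir "$LOCKDIR"              # peer is down/rebooting, safe to release
            exit 7                        # 7 = peer was fenced (my reading of drbd.conf(5))
        fi
        rmdir "$LOCKDIR"
        exit 1                            # fencing failed: leave our I/O frozen
    else
        # The peer presumably beat us to the lock. Don't touch the data;
        # a non-zero exit under resource-and-stonith keeps local I/O suspended,
        # and the winner should STONITH us shortly anyway.
        exit 1
    fi

The same semaphore could just as well be the single-connection TCP port idea; mkdir on NFS simply seemed like the least code to get started with.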