If you don’t mind, please allow me to walk through my architecture just a bit. 
I know that I am far from an expert on this stuff, but I feel like I have a 
firm grasp on how this all works conceptually. That said, I welcome your 
insights and advice on how to approach this problem—and any ready-made 
solutions to it you might have on hand. :)

We are deploying into two availability zones (AZs) in AWS. Our goal is to be 
able to absorb the loss of an entire AZ and continue to provide services to 
users. Our first problem comes with Microsoft SQL Server running on Windows 
Server Failover Clustering. As you guys likely know, WSFC isn’t polite about 
staying up without a quorum. As such, I figured, oh, hey, I can build a 
two-node Pacemaker-based iSCSI target cluster and expose LUNs from it via iSCSI
so that the WSFC nodes could have a witness filesystem.
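
Concretely, the stack I had in mind looks something like this (pcs 0.10-ish
syntax; the resource names, IP, volume names, and IQN below are placeholders,
not necessarily what's in my CIB):

    # DRBD-replicated volume backing the witness LUN (promotable clone)
    pcs resource create witness_drbd ocf:linbit:drbd drbd_resource=witness \
        promotable promoted-max=1 promoted-node-max=1 clone-max=2 notify=true

    # Floating IP the WSFC initiators connect to
    pcs resource create witness_vip ocf:heartbeat:IPaddr2 \
        ip=10.0.0.50 cidr_netmask=24

    # Exclusive VG activation on whichever node is DRBD primary
    pcs resource create witness_lvm ocf:heartbeat:LVM-activate \
        vgname=vg_witness vg_access_mode=system_id

    # tgt-backed iSCSI target, plus the LUN itself
    pcs resource create witness_tgt ocf:heartbeat:iSCSITarget \
        iqn=iqn.2016-01.org.example:witness implementation=tgt
    pcs resource create witness_lun ocf:heartbeat:iSCSILogicalUnit \
        target_iqn=iqn.2016-01.org.example:witness lun=1 \
        path=/dev/vg_witness/lv_witness

    # Keep it all together on the DRBD primary, in start order
    pcs resource group add witness_group witness_vip witness_lvm \
        witness_tgt witness_lun
    pcs constraint colocation add witness_group with master \
        witness_drbd-clone INFINITY
    pcs constraint order promote witness_drbd-clone then start witness_group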

So, I’ve managed to make that all happen. Yay! What I’m now trying to suss out 
is how I ensure that I’m covered for availability in the event of any kind of 
outage. As it turns out, I believe that I actually have most of that covered.

(I use “1o” to indicate “primary” and “2o” for “secondary”)

Planned Outage: the affected node is gracefully demoted from the cluster (see 
the quick walkthrough after this list), life goes on, everyone is happy.
Unplanned Outage (NAS cluster node fails/unreachable): if the 2o node fails, 
nothing happens; if the 1o node fails, DRBD promotes the 2o node to 1o, and 
the constrained VIP, LVM, tgt, and LUN resources automatically flip to the 2o 
node; life goes on, everyone is happy.
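
(For the record, the "gracefully demoted" part of the planned case is nothing 
fancier than standby mode; the node names here are hypothetical:)

    pcs node standby nas-a     # drain nas-a: DRBD demotes, witness_group moves
    crm_mon -1                 # one-shot status: confirm role and group landed
    pcs node unstandby nas-a   # after maintenance, rejoin as 2o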

But still, the one scenario we built this ridiculously complicated and 
overengineered thing for is the one where I don't feel like I have a good 
story: a severed-AZ event (loss of perimeter communications, etc.).

Unplanned Outage (AZ connectivity severed): each node detects that the other 
is gone and promotes itself to primary. The unsevered side would continue to 
work as expected, with the witness mounted by the SQL servers in that AZ; life 
goes on, and at least the USERS are happy. But the severed side is also 
soldiering on. Both sides of the SQL cluster would think they have quorum even 
though they can't talk to their peer nodes, so they mark their peers as down 
and keep on keeping on. No users would be connecting to the severed instances, 
but background and system tasks would proceed as normal, potentially writing 
new data to the databases and making it tricky, to say the least, to rejoin 
the nodes to the cluster, especially once the severed side's network comes 
back up and both systems realize that they're not consistent.
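
(DRBD at least won't quietly paper over that last part. I've got the usual 
split-brain policy knobs set in the resource config, something like the below; 
the discard choices are my assumptions about what's least bad for a tiny 
witness volume, not a recommendation:)

    resource witness {
      net {
        # Neither node was primary while split: take the changes that exist.
        after-sb-0pri discard-zero-changes;
        # One node was primary: throw away whatever the secondary did.
        after-sb-1pri discard-secondary;
        # BOTH were primary (the severed-AZ case): refuse to auto-heal;
        # stay disconnected and make a human sort it out.
        after-sb-2pri disconnect;
      }
      handlers {
        # Ships with DRBD; mails root when split-brain is detected.
        split-brain "/usr/lib/drbd/notify-split-brain.sh root";
      }
    }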

So, my problem, I think, is two-fold:


1. What can I monitor from each of the NAS cluster instances (besides 
connectivity to one another) that would ALWAYS be available when things are 
working and NEVER available when they are broken? It seems to me that if I can 
sort out something that meets these criteria (I was thinking, perhaps, a 
RESTful call to the AWS API, but I'm not sure whether those endpoints are 
partly hosted inside the AZ, in which case a severed node might still get 
responses), then I could write a simple monitoring script, running on both 
nodes, that acts as a fencing and STONITH solution: if it detects bad things, 
it shuts its own node down. That should prevent the data inconsistency, since 
the severed side's WSFC would lose its witness file system, thus its quorum, 
and take itself offline. (I've sketched the idea after this list.)

2. Have I failed to account for some other failure condition that could be as 
harmful as, or more harmful than, anything I've thought of already?
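
To make question 1 concrete, here's the shape of the script I have in mind. 
Every specific in it is a placeholder to be argued with: the probe URL 
especially (per the caveat above, I don't know whether the regional EC2 
endpoint is served partly from inside an AZ), and the peer address and the 
"stand down" action too:

    #!/bin/bash
    # az-watchdog.sh -- a sketch, not production code. Run from a systemd
    # timer (or cron) on both NAS nodes.
    #
    # Idea: if my DRBD peer is unreachable AND my external reference point
    # is unreachable, assume *I* am on the severed side and stand down, so
    # the WSFC nodes in my AZ lose the witness (and with it, quorum).

    PEER_IP="10.0.2.10"                           # placeholder: other NAS node
    PROBE="https://ec2.us-east-1.amazonaws.com/"  # placeholder "always up" URL

    # Peer reachable: normal operation, or at worst a failure that
    # Pacemaker/DRBD already handle. Nothing to do.
    ping -c 2 -W 2 "$PEER_IP" >/dev/null 2>&1 && exit 0

    # Peer is gone. Can we still see the outside world? Note this only
    # tests TCP/TLS reachability; any HTTP response at all counts as "up".
    if curl -s -o /dev/null --max-time 5 "$PROBE"; then
        # Peer down but we're connected: a plain node failure. Let the
        # cluster promote us as designed.
        exit 0
    fi

    # Peer AND probe unreachable: assume a severed AZ. Stop the cluster
    # stack so the witness LUN disappears from our side; a harsher variant
    # would power off the instance outright.
    logger -t az-watchdog "peer and probe unreachable; assuming severed AZ"
    pcs cluster stop --force

The open question there is whether stopping the stack is enough, or whether 
the node should power itself off so it can't rejoin on its own the moment the 
network heals.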

Anyway, I'm hopeful someone here can share some of their own experiences from 
the trenches. Thank you for your time (and for all the help you guys have been 
in getting this set up already).

--

[ jR ]

  @: ja...@eramsey.org

  there is no path to greatness; greatness is the path
