Hi,

We're attempting to build a multi-site cluster:

1. The web tier and application tier are active in all sites.
2. Only one database is active at a time, normally in the designated "primary" site.

We want to use 3 sites to maintain a quorum. So, if the primary site loses sight of both of the other sites, it will shut itself down. If the other sites both lose sight of the primary site, they will co-operate in electing one of the pair as the new primary, and bring up the database services.

I am thinking that in each site, a number of sentinel processes could hold open ephemeral znodes flagging that the site is up, with names like "site1/sentinel-1". These sentinels could be plugged into local health monitoring and, when the site falls into disrepair, remove themselves. If links between sites fail, then the ephemeral nodes would disappear too.

Each site would have a process that periodically checks for the presence of the sentinel znodes of the other sites. If all disappear, then the site knows it is in a minority partition, and shuts down services as required.

Is this a viable approach, or am I taking ZooKeeper out of its application domain and just asking for trouble?

regards,
Martin
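
P.S. To make the per-site check concrete, here is a rough sketch in Python of the decision I have in mind. It is pure logic with no ZooKeeper client in it. The function names, the site names and the assumption that every sentinel path has the form "site1/sentinel-1" are only for illustration, and in practice the list of paths would come from listing the znodes with whatever client library is used.

```python
from typing import Iterable, Set


def live_sites(sentinel_paths: Iterable[str]) -> Set[str]:
    """Return the sites that have at least one sentinel znode present.

    Assumes (hypothetically) that each path looks like "<site>/<sentinel>",
    e.g. "site1/sentinel-1"; leading or trailing slashes are ignored.
    """
    sites = set()
    for path in sentinel_paths:
        site, _, sentinel = path.strip("/").partition("/")
        if site and sentinel:  # skip bare site nodes with no sentinel under them
            sites.add(site)
    return sites


def in_minority_partition(
    local_site: str,
    all_sites: Iterable[str],
    sentinel_paths: Iterable[str],
) -> bool:
    """True when no sentinel of any *other* site is visible.

    This is the condition described above: if all sentinel znodes of the
    other sites have disappeared, the local site treats itself as being in
    a minority partition and should shut its services down.
    """
    other_sites = set(all_sites) - {local_site}
    visible_others = live_sites(sentinel_paths) & other_sites
    return not visible_others
```

With three sites named site1, site2 and site3, a site1 that can only see "site1/sentinel-1" gets True from in_minority_partition, while one that can also see "site2/sentinel-2" gets False.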