Hey Austin,

It sounds like you are asking about read availability when the primary cluster becomes unhealthy. If so, take a look at the HBase on S3 read replica clusters feature [1][2], which allows highly available reads if the primary cluster becomes unhealthy.

Let me know if I misinterpreted your ask!

Thanks,
Zach

[1] https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hbase-s3.html#emr-hbase-s3-read-replica
[2] https://aws.amazon.com/blogs/big-data/setting-up-read-replica-clusters-with-hbase-on-amazon-s3/

> ---------- Forwarded message ---------
>> From: Austin Heyne <ahe...@ccri.com>
>> Date: Thu, Aug 30, 2018 at 8:30 AM
>> Subject: HA master on EMR
>> To: <user@hbase.apache.org>
>>
>>
>> HBase on EMR is fairly reliable but is still subject to hardware
>> failures (which has happened to me before). Is there a best practice for
>> adding backup masters to an EMR cluster?
>>
>> I know this isn't technically a supported feature from AWS but we're
>> already heavily invested in HBase on EMR and would like to investigate
>> options for mitigating the risk of a master failure. In EMR if the master
>> dies the entire cluster is terminated so we need failover for HBase,
>> Hadoop/HDFS and Zookeeper. The one idea that I've had is to create a
>> second (or third) EMR cluster with its HBase, Zookeeper and Hadoop/HDFS
>> configuration pointed to the primary cluster. This would in effect add
>> the RegionServers and Datanodes to the primary cluster. I know that
>> losing 1/3 to 1/2 of your Datanodes would most likely mean you would
>> lose some WALs but re-ingesting the last day's worth of data is an
>> acceptable trade-off for us in exchange for not having downtime.
>>
>> I realize this is a slightly crazy idea and using something like
>> Kubernetes is the 'correct' solution but I have to work with what we
>> have and mitigate possible issues. My question is: are there any big
>> issues that anyone would foresee us having with this idea?
>>
>> Thanks for the feedback,
>> Austin
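
P.S. In case it helps, here is a rough sketch of the cluster configuration that the read replica setup in [1] relies on, as I understand it. The helper name `hbase_s3_configurations` and the bucket path are made up for illustration, and the property names are written from memory of [1], so please check them against the docs before using this.

```python
import json


def hbase_s3_configurations(root_dir, read_replica=False):
    """Build an EMR configurations list for HBase on S3.

    root_dir is the S3 location of the HBase root directory. The primary
    and every read replica cluster are assumed to point at the same one.
    """
    hbase_props = {"hbase.emr.storageMode": "s3"}
    if read_replica:
        # Assumption: this flag marks the cluster as a read-only replica.
        hbase_props["hbase.emr.readreplica.enabled"] = "true"
    return [
        {"Classification": "hbase-site",
         "Properties": {"hbase.rootdir": root_dir}},
        {"Classification": "hbase",
         "Properties": hbase_props},
    ]


if __name__ == "__main__":
    # Hypothetical bucket and prefix; substitute your own.
    root = "s3://my-bucket/my-hbase-store"
    print(json.dumps(hbase_s3_configurations(root, read_replica=True), indent=2))
```

The printed JSON should be usable as the file passed to `aws emr create-cluster --configurations` when launching the replica. Calling the helper with `read_replica=False` gives the matching configuration for the primary cluster.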