Assume you have a 30-node cluster distributed across three AZs with an RF of 
3. I am trying to come up with a runbook to manage multi-node failure as a 
result of …

- Loss of the entire AZ1
- Loss of multiple nodes in AZ2
- AZ3 unaffected. No node loss

Is this the optimal plan? Replacing dead nodes via bootstrapping …

1. Replace seed nodes first (via bootstrap)
2. Bootstrap the few nodes in AZ2
3. Bootstrap all nodes in AZ1
4. Run a cluster repair.
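
To make the ordering concrete, here is a rough sketch of the four steps as a small planner. It only produces an ordered list of step descriptions and does not touch a cluster. The `Node` class, the `replacement_plan` function, the `repair_per_node` switch and the node names are all hypothetical, made up for illustration; the switch is there only to show the two repair timings being compared, not to say which one is correct.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A dead node to be replaced (hypothetical model, not a real API)."""
    name: str
    az: str
    is_seed: bool = False


def replacement_plan(dead_nodes, repair_per_node=False):
    """Return step descriptions: seeds first, then AZ2, then AZ1, then repair."""

    def priority(node):
        # Step 1: seed nodes, regardless of AZ.
        if node.is_seed:
            return 0
        # Step 2: the few nodes in AZ2; step 3: everything else (AZ1).
        return 1 if node.az == "AZ2" else 2

    steps = []
    # sorted() is stable, so nodes within one step keep their input order.
    for node in sorted(dead_nodes, key=priority):
        steps.append(f"bootstrap {node.name} ({node.az})")
        if repair_per_node:
            steps.append(f"repair {node.name}")
    if not repair_per_node:
        # Step 4: a single repair once every node has been bootstrapped.
        steps.append("repair cluster")
    return steps


if __name__ == "__main__":
    dead = [
        Node("az1-n1", "AZ1"),
        Node("az1-n2", "AZ1", is_seed=True),
        Node("az2-n1", "AZ2"),
    ]
    for step in replacement_plan(dead):
        print(step)
```

Calling `replacement_plan(dead, repair_per_node=True)` instead interleaves a repair step after each bootstrap, which is the alternative I am asking about.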

Do you wait to bootstrap everything before running repair or do you repair per 
node?
Did I miss anything? 

----------------
Thank you
