Hello,

I have been wanting to write a proposal on using Apache Helix for cluster management in HBase, but first wanted to hear some thoughts/feedback on whether such an exercise would be useful.
I understand that HBase has built its own cluster management solution, but integrating with Helix can provide additional benefits such as:

- Multiple replicas per region, with a role such as primary/secondary assigned to each replica.
- Failover handling: promote a secondary to master.
- Expansion: redistribute the load when new nodes are added.

In terms of architecture, Helix fits well with the HBase architecture. The Helix controller is similar to the HBase Master and, like HBase, Helix uses ZooKeeper to store the cluster state. We have built in many optimizations to make failover fast and reliable.

The core philosophy behind Helix has been to abstract cluster management out of the core functionality and treat cluster management as a first-class citizen. This allows systems to benefit from features developed in Helix such as flapping detection, non-trivial fault detection, chaos monkey, throttling of data movement, etc.

Apache Helix currently powers the core back-end infrastructure components (data store, search, and pub/sub) at LinkedIn and has been in production for more than a year, managing more than 1k machines.

I would appreciate feedback/thoughts on this topic.

Thanks,
Kishore G

Additional info:
Helix - http://helix.incubator.apache.org
SOCC Paper <http://www.slideshare.net/KishoreGopalakrishna/helix-onecol>

Reading material: Systems that use Helix
Espresso <http://www.slideshare.net/amywtang/espresso-20952131>
Databus <https://915bbc94-a-62cb3a1a-s-sites.googlegroups.com/site/acm2012socc/s18-das.pdf>
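
To make the replica roles and failover behaviour listed above a bit more concrete, here is a toy sketch in Python. It does not use the Helix or HBase APIs; the function names (`assign`, `fail_over`), the round-robin placement, and the "pick the first surviving replica" promotion rule are all assumptions made for illustration, not how Helix actually decides placement.

```python
PRIMARY, SECONDARY = "PRIMARY", "SECONDARY"


def assign(regions, nodes, replicas=2):
    """Place `replicas` copies of each region on distinct nodes, round-robin.

    Hypothetical helper: the first copy of each region is the primary,
    the remaining copies are secondaries.
    """
    if replicas > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement = {}
    for i, region in enumerate(regions):
        chosen = [nodes[(i + k) % len(nodes)] for k in range(replicas)]
        placement[region] = {
            node: (PRIMARY if k == 0 else SECONDARY)
            for k, node in enumerate(chosen)
        }
    return placement


def fail_over(placement, dead_node):
    """Remove `dead_node`; promote a secondary wherever it held the primary."""
    for roles in placement.values():
        lost = roles.pop(dead_node, None)
        if lost == PRIMARY and roles:
            survivor = sorted(roles)[0]  # deterministic pick, for the sketch only
            roles[survivor] = PRIMARY
    return placement


if __name__ == "__main__":
    placement = assign(["r1", "r2", "r3"], ["n1", "n2", "n3"])
    print(fail_over(placement, "n1"))
```

Expansion could be sketched the same way by calling `assign` again with a longer node list, although a real controller would move as few replicas as possible and throttle the data movement.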