Konstantin Shvachko wrote:
And the only remaining step is to implement a fail-over mechanism.
:)
Colleagues of mine work on the HA stuff; I try to steer clear of it as it
gets complex fast. Test case: what happens when a network failure
splits the datacentre in two? You now have two clusters, each with half
the data and possibly a primary/secondary master in each one. Leave the
partition up for a while, perform inconsistent operations on each side,
then have the network come back up. Then work out how to merge the state.
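Merging the divergent state afterwards is the hard part; the usual dodge is to
stop the divergence happening in the first place with fencing tokens/epoch
numbers, so the stale master gets its writes rejected once a newer master has
registered. A rough sketch of that idea below -- plain Java, not anything from
HDFS, all the names are made up:

import java.util.concurrent.atomic.AtomicLong;

public class FencedStore {
    // highest master epoch this store has ever seen
    private final AtomicLong highestEpochSeen = new AtomicLong(0);

    /** Called by a newly elected master; returns its fencing token. */
    public long registerMaster(long proposedEpoch) {
        long current;
        do {
            current = highestEpochSeen.get();
            if (proposedEpoch <= current) {
                throw new IllegalStateException(
                    "Epoch " + proposedEpoch + " is stale; current is " + current);
            }
        } while (!highestEpochSeen.compareAndSet(current, proposedEpoch));
        return proposedEpoch;
    }

    /** Every mutation carries the caller's epoch; stale masters are fenced off. */
    public void write(long epoch, String op) {
        if (epoch < highestEpochSeen.get()) {
            throw new IllegalStateException(
                "Rejecting '" + op + "': master with epoch " + epoch + " has been fenced");
        }
        System.out.println("Applied '" + op + "' at epoch " + epoch);
    }

    public static void main(String[] args) {
        FencedStore store = new FencedStore();
        long oldMaster = store.registerMaster(1);   // master on one side of the split
        long newMaster = store.registerMaster(2);   // master elected on the other side
        store.write(newMaster, "mkdir /a");         // accepted
        try {
            store.write(oldMaster, "mkdir /b");     // rejected: old master is fenced
        } catch (IllegalStateException fenced) {
            System.out.println(fenced.getMessage());
        }
    }
}

It only stops both sides committing to the same store, of course; it doesn't
help you reconcile whatever each side did locally during the partition.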
Looking at the Facebook/Google "multi-master" solutions, I think they
don't worry about consistency; they just let the masters drift apart.
See also Johan's recent talk on HDFS: http://www.slideshare.net/steve_l/hdfs