On Fri, May 11, 2012 at 3:24 AM, Mikael Sitruk <mikael.sit...@gmail.com> wrote:
> Sorry for not being precise enough.
> The point is that I'm trying to check the impact of HA scenarios. One of
> them is when the master goes down.
> It is true that the master is not in the critical path of read/write
> unless (please correct me if I'm wrong):
> 1. new clients are trying to connect
Clients don't go to the master unless they are trying to do administrative ops.

> 2. split/merge occurs
> 3. another node fails.

If the master is down, these events are not processed. When the master restarts, it'll finish processing these event types.

> So in case the master goes down and starts back up, I'm interested to
> understand how long the system will be unavailable (under the
> scenario above)
>

For 2. above, splits shouldn't cause off-line'd-ness (excuse the neologism). The regionserver edits .META. on split, offlining the parent and onlining the split daughters. It only bothers to tell the master about the splits so the master can keep current the state of the cluster it keeps in its 'head' (when a new master comes online, the first thing it does is reconstitute this image).

For item 3. above, it's the master that runs the distributed log split. If there is no master, there is no one to run the split, so those regions will be offline until a master comes online again, finds the offline server and runs a split of its logs.

St.Ack
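
P.S. To make cases 2 and 3 concrete, here is a toy model of the behavior described above. It is only a sketch, not HBase code: `ToyCluster` and all of its methods and fields are made-up names, and it assumes (as a simplification) that regions come straight back online once the log split has run.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Hypothetical toy model of the thread's scenarios; not an HBase API."""
    master_up: bool = True
    online: set = field(default_factory=set)
    # dead server name -> regions waiting for that server's logs to be split
    awaiting_log_split: dict = field(default_factory=dict)

    def split(self, parent, daughters):
        # Case 2: the regionserver edits .META. itself, so the split
        # completes whether or not a master is running.
        self.online.discard(parent)
        self.online.update(daughters)

    def server_fails(self, server, regions):
        # Case 3: the dead server's regions go offline until its logs are split.
        self.online.difference_update(regions)
        self.awaiting_log_split[server] = set(regions)
        if self.master_up:
            self._run_log_splits()

    def master_restarts(self):
        # A master coming online finds the dead servers and splits their logs.
        self.master_up = True
        self._run_log_splits()

    def _run_log_splits(self):
        # Only the master runs the distributed log split.
        for server, regions in list(self.awaiting_log_split.items()):
            self.online.update(regions)  # simplification: back online at once
            del self.awaiting_log_split[server]


cluster = ToyCluster(master_up=False, online={"r1", "r2", "r3"})
cluster.split("r1", ["r1a", "r1b"])   # works without a master
cluster.server_fails("rs2", ["r2"])   # r2 stays offline: no master to split logs
cluster.master_restarts()             # r2 comes back once the master is up
```

So in this model the unavailability window for case 3 is the time from the regionserver failure until a master is back and has finished the log split; case 2 has no such window.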