Thanks for the clarifications, St.Ack. I still have some questions regarding item 3 in the scenario discussed: when a region is offline, does that mean client operations on it (even reads) are not possible? If a second master is up (in an environment with multiple masters), I presume all of this holds only until the second master (the "slave") becomes the master, right? How long does it take for a "slave" master to become the master?
Mikael.S

On Sat, May 12, 2012 at 8:02 AM, Stack <st...@duboce.net> wrote:

> On Fri, May 11, 2012 at 3:24 AM, Mikael Sitruk <mikael.sit...@gmail.com>
> wrote:
> > Sorry for not being precise enough.
> > The point is that i'm trying to check the impact of HA scenarios. one of
> > them is when the master goes down.
> > That is true that the Master is not in the critical path of read/write
> > unless (please correct me if i'm wrong):
> > 1. new clients are trying to connect
>
> Clients don't go to the master, not unless they are trying to do
> administrative ops.
>
> > 2. split/merge occurs
> > 3. another node fails.
>
> If master is down, these events are not processed. On reboot of
> master, it'll finish the processing of these event types.
>
> > So in case the master goes down and starts back, i'm interested to
> > understand how long of unavailability the system will be (under the
> > scenario above)
> >
>
> For 2. from above, splits shouldn't cause off-line'd-ness (excuse the
> neologism). The regionserver edits .META. on split offlining parent
> and onlining the split daughters. It only bothers to tell the master
> about the splits so master can keep current the state of the cluster
> it keeps in its 'head' (when new master comes online, first thing it
> does is reconstitute this image).
>
> For item 3. above, it's the master that runs the distributed log split.
> If no master, no one to run the split so those regions will be
> offline until a master comes online again, finds the offline server and
> runs a split of its logs.
>
> St.Ack
>
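
For readers following item 3, here is a toy sketch of the behaviour described in the quoted reply: a region server fails while no master is running, its regions stay offline, and they come back only once a master is up to process the failure. This is plain Python, not HBase code. `ToyCluster` and all of its methods are made up for illustration, and the "log split plus reassignment" step is reduced to a single reassignment to any surviving server, which is an assumption of the model rather than how HBase picks a target.

```python
class ToyCluster:
    """Toy model of the thread's item 3; hypothetical, not an HBase API."""

    def __init__(self, assignments):
        # assignments: region name -> region server name
        self.assignments = dict(assignments)
        self.live = set(assignments.values())  # servers still running
        self.master_up = True
        self.offline = set()                   # regions currently offline
        self.unprocessed = []                  # dead servers awaiting recovery

    def stop_master(self):
        self.master_up = False

    def start_master(self):
        # A master comes online and finishes any pending recovery work.
        self.master_up = True
        self._recover()

    def fail_server(self, server):
        # Regions hosted on the failed server go offline right away.
        self.live.discard(server)
        self.offline |= {r for r, s in self.assignments.items() if s == server}
        self.unprocessed.append(server)
        if self.master_up:
            self._recover()

    def _recover(self):
        # Stand-in for "split the dead server's logs, then reassign".
        if not self.live:
            return  # nowhere to reassign; leave the work pending
        for dead in self.unprocessed:
            target = sorted(self.live)[0]  # assumption: pick any live server
            for region, owner in list(self.assignments.items()):
                if owner == dead:
                    self.assignments[region] = target
                    self.offline.discard(region)
        self.unprocessed.clear()

    def is_online(self, region):
        return region not in self.offline


cluster = ToyCluster({"r1": "rs1", "r2": "rs2"})
cluster.stop_master()
cluster.fail_server("rs1")
print(cluster.is_online("r1"))  # False: no master to run the recovery
print(cluster.is_online("r2"))  # True: regions on healthy servers are unaffected
cluster.start_master()
print(cluster.is_online("r1"))  # True: recovery ran once a master was back
```

The model says nothing about how long the gap lasts. In a real cluster that depends on how quickly a master (restarted or backup) takes over and on how long the log split takes, neither of which is quantified in this thread.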