Thanks for your reply and the clarification! It sounds like a fencing mechanism?
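To check that I got the idea, let me paraphrase the WAL part of Duo's
explanation as code against the plain HDFS API. This is only a sketch of my
understanding: the "-splitting" suffix, the method name, and the missing
error handling are my guesses, not the actual HBase internals.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class WalFencingSketch {

  // Fence a region server the master believes is dead. All names here
  // are guesses for illustration, not the real HBase code paths.
  static void fenceDeadServer(Configuration conf, Path walDir) throws Exception {
    FileSystem fs = FileSystem.get(conf);

    // Step 1: rename the WAL directory, so the GC-paused RS can no
    // longer roll new WAL files under the path it knows about.
    Path splittingDir = new Path(walDir.getParent(), walDir.getName() + "-splitting");
    fs.rename(walDir, splittingDir);

    // Step 2: recover the lease on every WAL file. HDFS allows a single
    // writer per file; once the lease is revoked and the file is closed,
    // appends from the old RS's writer fail.
    DistributedFileSystem dfs = (DistributedFileSystem) fs;
    for (FileStatus wal : dfs.listStatus(splittingDir)) {
      dfs.recoverLease(wal.getPath());
    }
  }
}

If that is right, the rename and the lease recovery together mean the old
RS can neither append to its existing WAL files nor create new ones, which
is exactly the fencing effect.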
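And for the session mechanics Wellington described, I picture the client
side roughly like the snippet below. Again just a sketch: the connect
string, the 30-second timeout, and aborting via System.exit are made up
for illustration, not what the RS really does internally.

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class SessionExpirySketch implements Watcher {

  @Override
  public void process(WatchedEvent event) {
    // While the JVM is frozen in a full GC it cannot heartbeat, so the
    // ensemble expires the session. When the client reconnects, ZK
    // reports Expired, and the only safe reaction is to stop serving.
    if (event.getState() == Event.KeeperState.Expired) {
      System.err.println("Session expired while we were paused; aborting.");
      System.exit(1); // stand-in for the RS killing itself
    }
  }

  public static void main(String[] args) throws Exception {
    // 30s session timeout: pause the JVM longer than this (e.g. with a
    // long full GC) and the session expires even though the process is
    // still alive.
    new ZooKeeper("localhost:2181", 30_000, new SessionExpirySketch());
    Thread.sleep(Long.MAX_VALUE);
  }
}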
I'd also like to look for JIRAs about this issue, that is, coordination
during master switch. Maybe something like this one[1]?

Best,
tison.

[1] https://issues.apache.org/jira/browse/HBASE-5549

On Thu, Jun 6, 2019 at 10:15 PM, Wellington Chevreuil
<wellington.chevre...@gmail.com> wrote:

> Hey Zili,
>
> Besides what Duo explained previously, just clarifying some concepts from
> your previous description:
>
> > 1) RegionServer started full gc and timeout on ZooKeeper. Thus ZooKeeper
> > regarded it as failed.
>
> ZK just knows about sessions and clients, not the type of client
> connecting to it. Clients open a session in ZK, then keep pinging back ZK
> periodically to keep the session alive. In the case of long full GC
> pauses, the client (the RS, in this case) will fail to ping back within
> the required period. At this point, ZK will *expire* the session.
>
> > 2) ZooKeeper launched a new RegionServer, and the new one started to
> > serve.
>
> ZK doesn't launch new RSes; it doesn't know about RSes, only client
> sessions. Upon the session expiration, the Master will be notified that
> an RS is potentially gone, and will start the process explained by Duo.
>
> > 3) The old RegionServer finished gc and thought itself was still active
> > and serving.
>
> What really happens here is that once the RS is back from GC, it will try
> to ping ZK again for that session. ZK will reject it because the session
> is already expired, and the RS will then kill itself.
>
> On Thu, Jun 6, 2019 at 14:58, 张铎(Duo Zhang) <palomino...@gmail.com>
> wrote:
>
> > Once an RS is started, it will create its WAL directory and start to
> > write WALs into it. And if the master thinks an RS is dead, it will
> > rename the WAL directory of that RS and call recover lease on all the
> > WAL files under the directory to make sure that they are all closed. So
> > even if the RS comes back after a long GC, before it kills itself
> > because of the SessionExpiredException it cannot accept any write
> > requests any more: its old WAL files are closed, and since the WAL
> > directory is also gone it cannot create new WAL files either.
> >
> > Of course, you may still read from the dead RS at this moment, so
> > theoretically you could read stale data, which means HBase cannot
> > guarantee 'external consistency'.
> >
> > Hope this solves your problem.
> >
> > Thanks.
> >
> > On Thu, Jun 6, 2019 at 9:38 PM, Zili Chen <wander4...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > Recently, from the book ZooKeeper: Distributed Process Coordination
> > > (Chapter 5, section 5.3), I found a paragraph mentioning that HBase
> > > once suffered from:
> > >
> > > 1) RegionServer started full gc and timeout on ZooKeeper. Thus
> > > ZooKeeper regarded it as failed.
> > > 2) ZooKeeper launched a new RegionServer, and the new one started to
> > > serve.
> > > 3) The old RegionServer finished gc and thought itself was still
> > > active and serving.
> > >
> > > I'm interested in it and would like to know how the HBase community
> > > overcame this issue.
> > >
> > > Best,
> > > tison.