Once a RegionServer (RS) starts, it creates its WAL directory and begins writing WAL files into it. If the master decides that an RS is dead, it renames that RS's WAL directory and calls recover lease on every WAL file under it to make sure they are all closed. So even if the RS comes back after a long GC, it can no longer accept write requests in the window before it kills itself because of the SessionExpiredException: its old WAL file is closed, and its WAL directory is gone, so it cannot create new WAL files either.
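To make the fencing sequence above concrete, here is a minimal toy sketch in Python. It is not HBase or HDFS code: `ToyFileSystem`, `ToyRegionServer`, `master_fence`, the `WALs/<name>` layout and the `-splitting` suffix are all names made up for illustration, and the in-memory dictionary only stands in for the file system. It assumes a write is acknowledged only if the WAL append succeeds.

```python
class ToyFileSystem:
    """In-memory stand-in for the file system (hypothetical, not the HDFS API)."""

    def __init__(self):
        # directory path -> {file name: True if the file is still open for append}
        self.dirs = {}

    def mkdir(self, path):
        self.dirs[path] = {}

    def create(self, path, name):
        if path not in self.dirs:
            raise FileNotFoundError(path)
        self.dirs[path][name] = True

    def append(self, path, name):
        # Appending fails if the directory is gone or the file has been closed.
        if not self.dirs.get(path, {}).get(name, False):
            raise OSError(f"{path}/{name} is closed or missing")

    def rename(self, src, dst):
        self.dirs[dst] = self.dirs.pop(src)

    def recover_lease(self, path, name):
        # Models lease recovery: the file ends up closed for the old writer.
        self.dirs[path][name] = False


class ToyRegionServer:
    """Toy RS: creates its WAL directory on start and appends to a WAL file."""

    def __init__(self, fs, name):
        self.fs = fs
        self.wal_dir = f"WALs/{name}"
        self.wal_seq = 0
        fs.mkdir(self.wal_dir)
        fs.create(self.wal_dir, self._wal_name())

    def _wal_name(self):
        return f"wal.{self.wal_seq}"

    def write(self, edit):
        """Return True only if the edit could be appended to a WAL file."""
        try:
            self.fs.append(self.wal_dir, self._wal_name())
            return True
        except OSError:
            pass
        # The current WAL is unusable: try to roll to a new file in the same directory.
        try:
            self.wal_seq += 1
            self.fs.create(self.wal_dir, self._wal_name())
            self.fs.append(self.wal_dir, self._wal_name())
            return True
        except OSError:
            return False  # no usable WAL, so the write is rejected


def master_fence(fs, wal_dir):
    """Toy master: rename the dead RS's WAL directory, then close every WAL file."""
    fenced = wal_dir + "-splitting"  # suffix chosen for illustration
    fs.rename(wal_dir, fenced)
    for name in list(fs.dirs[fenced]):
        fs.recover_lease(fenced, name)
    return fenced
```

In this model, an RS that wakes up after `master_fence` has run fails both the append to its old WAL file and the attempt to create a new one, so `write` returns `False`.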
Of course, you may still be able to read from the dead RS during this window, so in theory you could read stale data, which means HBase cannot guarantee ‘external consistency’.

Hope this solves your problem. Thanks.

On Thu, Jun 6, 2019 at 9:38 PM, Zili Chen <wander4...@gmail.com> wrote:

> Hi,
>
> Recently from the book, ZooKeeper: Distributed Process Coordination, I find
> a paragraph mentions that, HBase once suffered by
>
> 1) RegionServer started full gc and timeout on ZooKeeper. Thus ZooKeeper
> regarded it as failed.
> 2) ZooKeeper launched a new RegionServer, and the new one started to serve.
> 3) The old RegionServer finished gc and thought itself was still active and
> serving.
>
> in Chapter 5 section 5.3.
>
> I'm interested on it and would like to know how HBase community overcame
> this issue.
>
> Best,
> tison.
>