Yes. In production it usually happens when a very long GC causes the RS's ZK session to expire, so the master declares it dead; all of its regions have already been reassigned to other RSes by the time the RS comes back, and then it kills itself.
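If you want to catch this early, a watcher like the minimal sketch below can help (illustrative only, not HBase code; it assumes the default 90s zookeeper.session.timeout). A stop-the-world pause also stalls the watcher's own loop, so an unusually long gap between wake-ups approximates the pause length:

    import java.util.concurrent.TimeUnit;

    // Illustrative pause watcher, not HBase code. A stop-the-world GC also
    // stalls this loop, so an unusually long gap between wake-ups
    // approximates the pause length.
    public class PauseWatcher {
        public static void main(String[] args) throws InterruptedException {
            final long intervalMs = 1000L;
            // Assumed to match zookeeper.session.timeout (90s by default).
            final long sessionTimeoutMs = 90_000L;
            long last = System.nanoTime();
            while (true) {
                Thread.sleep(intervalMs);
                long now = System.nanoTime();
                long pauseMs = TimeUnit.NANOSECONDS.toMillis(now - last) - intervalMs;
                if (pauseMs > sessionTimeoutMs / 10) { // warn well before expiry
                    System.err.println("JVM stalled for ~" + pauseMs + " ms; a pause near "
                        + sessionTimeoutMs + " ms will expire the ZK session");
                }
                last = now;
            }
        }
    }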
Natalie Chen <[email protected]> wrote on Fri, Jun 7, 2019 at 3:03 PM:

> The case about zookeeper is well known, since its data is actually
> saved locally.
>
> But I thought the RS writes/reads data to/from HDFS, so there is no
> such problem as replication latency.
>
> Can we say that the only chance of getting stale data from a RS is
> what you have described here, and that I only have to monitor RS
> heartbeats and control gc pauses?
>
> Thank you.
>
> 张铎(Duo Zhang) <[email protected]> wrote on Fri, Jun 7, 2019 at 1:50 PM:
>
> > Lots of distributed databases can not guarantee external
> > consistency. Even for zookeeper, when you update A and then tell
> > others to get A, the others may get a stale value, since they may
> > read from another replica which has not received the value yet.
> >
> > There are several ways to solve the problem in HBase, for example,
> > record the time when we successfully received the last heartbeat
> > from zk, and if it has been too long then we just throw an exception
> > to the client. But this is not a big deal for most use cases, as
> > within the same session, if you successfully update a value then you
> > can see the new value when reading. For external consistency, there
> > are also several ways to solve it.
> >
> > So proceed at your own risk: if you think external consistency is
> > super important to you, then you'd better choose another db. But
> > please consider it carefully; as said above, lots of databases do
> > not guarantee this either...
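To make the heartbeat idea above concrete, a minimal sketch could look like the following (names and structure are illustrative, not actual HBase internals):

    import java.io.IOException;
    import java.util.concurrent.TimeUnit;

    // Illustrative sketch, not actual HBase internals: remember the last
    // successful ZK heartbeat and refuse to serve reads once it is too old,
    // since by then the rest of the cluster may already consider us dead.
    class HeartbeatGuard {
        private final long maxStaleNanos;
        private volatile long lastHeartbeatNanos = System.nanoTime();

        HeartbeatGuard(long maxStaleMs) {
            this.maxStaleNanos = TimeUnit.MILLISECONDS.toNanos(maxStaleMs);
        }

        // Call whenever a ZK heartbeat succeeds.
        void onHeartbeat() {
            lastHeartbeatNanos = System.nanoTime();
        }

        // Call before serving a read.
        void ensureFresh() throws IOException {
            if (System.nanoTime() - lastHeartbeatNanos > maxStaleNanos) {
                throw new IOException(
                    "Last ZK heartbeat is too old; refusing to serve possibly stale data");
            }
        }
    }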
> > Natalie Chen <[email protected]> wrote on Fri, Jun 7, 2019 at 11:59 AM:
> >
> > > Hi,
> > >
> > > I am quite concerned about the possibility of getting stale data.
> > > I was expecting consistency from HBase when choosing it as our
> > > NoSQL db solution.
> > >
> > > So, if consistency is not guaranteed, meaning clients expecting
> > > the latest data may instead get wrong data from a "dead" RS
> > > because of a long gc or whatever, then even if the chance is
> > > slight, I have to be able to detect and repair the situation, or
> > > just consider looking for another, more suitable solution.
> > >
> > > So, would you kindly confirm that HBase has this "consistency"
> > > issue?
> > >
> > > Thank you.
> > >
> > > 张铎(Duo Zhang) <[email protected]> wrote on Thu, Jun 6, 2019 at 9:58 PM:
> > >
> > > > Once a RS is started, it will create its wal directory and start
> > > > to write wals into it. And if the master thinks a RS is dead, it
> > > > will rename the wal directory of the RS and call recover lease
> > > > on all the wal files under the directory, to make sure that they
> > > > are all closed. So even after the RS comes back from a long GC,
> > > > before it kills itself because of the SessionExpiredException,
> > > > it can not accept any write requests any more: its old wal file
> > > > is closed, and the wal directory is also gone, so it can not
> > > > create new wal files either.
> > > >
> > > > Of course, you may still read from the dead RS at this moment,
> > > > so theoretically you could read stale data, which means HBase
> > > > can not guarantee 'external consistency'.
> > > >
> > > > Hope this solves your problem.
> > > >
> > > > Thanks.
> > > >
> > > > Zili Chen <[email protected]> wrote on Thu, Jun 6, 2019 at 9:38 PM:
> > > >
> > > > > Hi,
> > > > >
> > > > > Recently, in the book ZooKeeper: Distributed Process
> > > > > Coordination (Chapter 5, section 5.3), I found a paragraph
> > > > > mentioning that HBase once suffered from the following:
> > > > >
> > > > > 1) A RegionServer started a full gc and timed out on
> > > > > ZooKeeper, so ZooKeeper regarded it as failed.
> > > > > 2) ZooKeeper launched a new RegionServer, and the new one
> > > > > started to serve.
> > > > > 3) The old RegionServer finished gc and thought itself still
> > > > > active and serving.
> > > > >
> > > > > I'm interested in it and would like to know how the HBase
> > > > > community overcame this issue.
> > > > >
> > > > > Best,
> > > > > tison.
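For reference, the WAL fencing described in my Jun 6 message above boils down to two HDFS operations. A rough sketch (simplified and illustrative, not the actual HBase implementation; the "-splitting" suffix follows HBase's naming convention):

    import java.io.IOException;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hdfs.DistributedFileSystem;

    // Simplified, illustrative sketch of the fencing step, not HBase's
    // actual code: rename the dead RS's WAL directory, then recover the
    // lease on every WAL file so the old writer can never append again.
    public class WalFencingSketch {
        static void fenceDeadRegionServer(DistributedFileSystem dfs, Path walDir)
                throws IOException {
            Path splitting = new Path(walDir.getParent(), walDir.getName() + "-splitting");
            if (!dfs.rename(walDir, splitting)) {
                throw new IOException("Failed to rename " + walDir);
            }
            for (FileStatus wal : dfs.listStatus(splitting)) {
                // recoverLease forcibly closes the file from HDFS's point of
                // view, so appends from the zombie RS will fail. Recovery is
                // asynchronous; a real implementation retries until it
                // returns true.
                dfs.recoverLease(wal.getPath());
            }
        }
    }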
