Re: How does HBase deal with master switch?

2019-06-07 Thread Duo Zhang
Yes, in production it usually happens when a very long GC causes the RS to die and all of its regions have been assigned to other RSes before the RS comes back and kills itself. Natalie Chen wrote on Fri, Jun 7, 2019 at 3:03 PM: > The case about zookeeper is well known since data is actually saved > l

Re: How does HBase deal with master switch?

2019-06-07 Thread Natalie Chen
The case about ZooKeeper is well known since data is actually saved locally. But I thought the RS writes/reads data to/from HDFS, so there's no such problem as replication latency. Can we say that the only chance for getting stale data from RS is what you have described here and I only have to monit

Re: How does HBase deal with master switch?

2019-06-06 Thread Duo Zhang
Lots of distributed databases cannot guarantee external consistency. Even for ZooKeeper, when you update A and then tell others to get A, the others may get a stale value, since they may read from another replica which has not received the new value yet. There are several ways to solve the problem in HB
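
The stale read described above (update A, tell someone else to read A, and they hit a replica that has not caught up) can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyReplicatedStore`, `catch_up`, and `demo_stale_read` are hypothetical names made up for the illustration, and nothing here is ZooKeeper's or HBase's actual API or replication protocol.

```python
class ToyReplicatedStore:
    """Hypothetical in-memory store: replica 0 is the leader, the rest lag."""

    def __init__(self, replica_count):
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = []  # (replica_index, key, value) not yet applied

    def write(self, key, value):
        # The leader applies the write at once; followers only queue it.
        self.replicas[0][key] = value
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, value))

    def read(self, key, replica_index):
        # A read is served from whatever the chosen replica currently holds.
        return self.replicas[replica_index].get(key)

    def catch_up(self):
        # Apply all queued writes to the followers, in order.
        for index, key, value in self.pending:
            self.replicas[index][key] = value
        self.pending.clear()


def demo_stale_read():
    store = ToyReplicatedStore(replica_count=2)
    store.write("A", "old")
    store.catch_up()
    store.write("A", "new")                    # writer updates A
    stale = store.read("A", replica_index=1)   # reader hits a lagging replica
    store.catch_up()
    fresh = store.read("A", replica_index=1)   # same replica after catching up
    return stale, fresh
```

In this toy model the reader sees the old value until the follower has applied the queued write, which is the window the message refers to.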

Re: How does HBase deal with master switch?

2019-06-06 Thread Natalie Chen
Hi, I am quite concerned about the possibility of getting stale data. I was expecting consistency in HBase when choosing HBase as our NoSQL DB solution. So, if consistency is not guaranteed, meaning clients expecting to see the latest data but, because of a long GC or whatever, got wrong data instead

Re: How does HBase deal with master switch?

2019-06-06 Thread Zili Chen
Thanks for your reply and clarification! It sounds like a mechanism like fencing? I'd also like to look for JIRAs about this issue, that is, coordination in master switch. Maybe something like this one [1]? Best, tison. [1] https://issues.apache.org/jira/browse/HBASE-5549 Wellington Chevreuil wrote on 201

Re: How does HBase deal with master switch?

2019-06-06 Thread Wellington Chevreuil
Hey Zili, Besides what Duo explained previously, just clarifying some concepts from your previous description:
> 1) RegionServer started full gc and timeout on ZooKeeper. Thus ZooKeeper regarded it as failed.
ZK just knows about sessions and clients, not the type of client connecting to it. Cl

Re: How does HBase deal with master switch?

2019-06-06 Thread Duo Zhang
Once an RS is started, it creates its WAL directory and starts writing WALs into it. If the master thinks an RS is dead, it renames the WAL directory of that RS and calls recover lease on all the WAL files under the directory to make sure that they are all closed. So even after the RS is back af
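
The fencing steps described above (rename the WAL directory, then recover the lease on every WAL file so it is closed) can be sketched with a toy in-memory filesystem. This is a minimal illustration under stated assumptions, not HBase or HDFS code: `ToyFs`, `WalFile`, `master_fence`, and the `-fenced` directory suffix are all hypothetical names chosen for the example.

```python
class WalClosedError(Exception):
    """Raised when a writer appends to a WAL file that has been closed."""


class WalFile:
    """Toy WAL file; a writer keeps a handle to it, as an RS would."""

    def __init__(self):
        self.closed = False
        self.entries = []

    def append(self, entry):
        if self.closed:
            raise WalClosedError("WAL file was closed by lease recovery")
        self.entries.append(entry)


class ToyFs:
    """Hypothetical stand-in for the filesystem: directory -> {name: WalFile}."""

    def __init__(self):
        self.dirs = {}

    def mkdir(self, path):
        self.dirs[path] = {}

    def create(self, path, name):
        if path not in self.dirs:
            raise FileNotFoundError(path)
        wal = WalFile()
        self.dirs[path][name] = wal
        return wal

    def rename(self, old, new):
        self.dirs[new] = self.dirs.pop(old)

    def recover_lease(self, path):
        # Toy lease recovery: mark every file under the directory closed.
        for wal in self.dirs[path].values():
            wal.closed = True


def master_fence(fs, wal_dir):
    """Rename the dead server's WAL directory, then close all its WAL files."""
    fenced_dir = wal_dir + "-fenced"  # illustrative suffix only
    fs.rename(wal_dir, fenced_dir)
    fs.recover_lease(fenced_dir)
    return fenced_dir


def demo():
    fs = ToyFs()
    fs.mkdir("WALs/rs1")
    wal = fs.create("WALs/rs1", "wal.0")
    wal.append("put row1")          # normal write before the long GC pause
    master_fence(fs, "WALs/rs1")    # master declares rs1 dead during the pause
    try:
        wal.append("put row2")      # rs1 wakes up and tries to keep writing
        return "accepted"
    except WalClosedError:
        return "rejected"
```

In this sketch the returning server can neither append to its old WAL file (it is closed) nor create a new one under its old directory (the directory no longer exists), so its late writes are refused.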

How does HBase deal with master switch?

2019-06-06 Thread Zili Chen
Hi, Recently, in the book ZooKeeper: Distributed Process Coordination, I found a paragraph mentioning that HBase once suffered from the following: 1) A RegionServer started a full GC and timed out on ZooKeeper. Thus ZooKeeper regarded it as failed. 2) ZooKeeper launched a new RegionServer, and the new one started to se