Yes, in production it usually happens when there is a very long GC which
causes the RS's ZooKeeper session to expire; all of its regions are assigned
to other RSes before the RS is back, and then it kills itself.
Natalie Chen wrote on Fri, Jun 7, 2019 at 3:03 PM:
> The case about zookeeper is well known since data is actually saved
> locally.
The case about zookeeper is well known since data is actually saved locally.
But I thought the RS writes/reads data to/from HDFS, so there's no such
problem as replication latency.
Can we say that the only chance of getting stale data from a RS is what you
have described here, and I only have to monitor for that?
Lots of distributed databases cannot guarantee external consistency. Even
for zookeeper, when you update A and then tell others to read A, the others
may get a stale value, since the read may be served by another replica which
has not received the update yet.
There are several ways to solve the problem in HBase.
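To make the replica-lag point concrete, here is a minimal, self-contained Java sketch. This is NOT the real ZooKeeper API; the class and method names are invented for illustration only. It models a leader and a lagging follower: a read served by the follower returns a stale value until the follower explicitly catches up (which is roughly what ZooKeeper's sync() call forces before a read).

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a leader and one lagging follower replica.
// Invented names, not the ZooKeeper client API; it only illustrates
// why a read served by a replica can be stale until a sync.
class ToyEnsemble {
    private final Map<String, String> leader = new HashMap<>();
    private final Map<String, String> follower = new HashMap<>();

    // Writes go to the leader; replication to the follower is NOT immediate.
    void write(String key, String value) {
        leader.put(key, value);
    }

    // Reads may be served by the follower, which can lag behind the leader.
    String readFromFollower(String key) {
        return follower.get(key);
    }

    // Models forcing the follower to catch up before the next read,
    // in the spirit of ZooKeeper's sync().
    void sync() {
        follower.clear();
        follower.putAll(leader);
    }
}
```

With this model, a write followed immediately by a follower read returns the old value, while calling sync() first returns the new one; that is the gap between "I updated A" and "everyone else sees the updated A".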
Hi,
I am quite concerned about the possibility of getting stale data. I was
expecting consistency from HBase when choosing it as our NoSQL DB solution.
So, if consistency is not guaranteed, clients expecting to see the latest
data could, because of a long GC or whatever, get stale data instead?
Thanks for your reply and clarification!
It sounds like a fencing mechanism?
I'd also like to look for JIRAs about this issue, that is,
coordination during master switch. Maybe something like this one[1]?
Best,
tison.
[1] https://issues.apache.org/jira/browse/HBASE-5549
Wellington Chevreuil wrote:
Hey Zili,
Besides what Duo explained previously, just clarifying on some concepts to
your previous description:
> 1) RegionServer started full gc and timeout on ZooKeeper. Thus ZooKeeper
> regarded it as failed.
ZK just knows about sessions and clients, not the type of client connecting
to it.
Once a RS is started, it will create its wal directory and start to write
wal into it. And if the master thinks a RS is dead, it will rename the wal
directory of the RS and call recover lease on all the wal files under the
directory to make sure that they are all closed. So even after the RS is
back after a long GC, it can no longer write to those wal files.
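A minimal sketch of that fencing idea, purely illustrative (the class and method names below are invented, not HBase code; real HBase renames the directory on HDFS and recovers the lease on each wal file): the master "renames" the dead RS's wal directory, so a writer that comes back from a long GC and still holds the old directory name can no longer append.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of wal-directory fencing. Invented names, not HBase code.
class ToyWalStore {
    private final Map<String, StringBuilder> dirs = new HashMap<>();

    void createDir(String dir) {
        dirs.put(dir, new StringBuilder());
    }

    // A RS appends to its wal directory by name.
    boolean append(String dir, String record) {
        StringBuilder wal = dirs.get(dir);
        if (wal == null) {
            return false; // directory was renamed away: the old writer is fenced
        }
        wal.append(record).append('\n');
        return true;
    }

    // The master fences a dead RS by renaming its wal directory.
    void fence(String dir, String renamedDir) {
        dirs.put(renamedDir, dirs.remove(dir));
    }
}
```

An append before the fence succeeds; an append to the old name after the fence fails. So a RS that wakes up from a GC pause after the master has already processed its wal cannot silently add new edits.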
Hi,
Recently, from the book ZooKeeper: Distributed Process Coordination, I found
a paragraph mentioning that HBase once suffered from:
1) RegionServer started full gc and timed out on ZooKeeper. Thus ZooKeeper
regarded it as failed.
2) ZooKeeper launched a new RegionServer, and the new one started to serve
the same regions.