On 26 Aug 2016, at 12:58, kant kodali <kanth...@gmail.com> wrote:

@Steve, your arguments make sense; however, a good majority of people who
have extensive experience with ZooKeeper prefer to avoid it, and given the
ease of Consul (which btw uses Raft for the election) and etcd, a lot of us
are more inclined to avoid ZK.

And yes, any technology needs time to mature, but that said, it shouldn't stop
us from transitioning. For example, people started using Spark when it was
first released instead of waiting for Spark 2.0, where there are a lot of
optimizations and bug fixes.



One way to look at the problem is "what is the cost if something doesn't work?"

If it's some HA consensus system, failure modes are "consensus failure,
everything goes into minority mode and offline": service lost, data fine.
Another is "partition with both groups thinking they are in charge", which is
more dangerous. Then there's "partitioning event not detected", which may be
bad.
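
A minimal sketch of the first two failure modes, assuming a plain
strict-majority quorum rule. The function names are hypothetical and not taken
from ZooKeeper, etcd or Consul; the "naive" rule is only there to show how a
split brain can arise.

```python
def has_quorum(reachable: int, cluster_size: int) -> bool:
    """Strict majority: more than half of the cluster, this node included."""
    return reachable > cluster_size // 2


def naive_has_quorum(reachable: int, cluster_size: int) -> bool:
    """Deliberately broken rule (assumption for illustration): half is enough."""
    return reachable * 2 >= cluster_size


def classify_partition(group_sizes, cluster_size, rule=has_quorum):
    """Label each side of a network partition as 'active' or 'offline'."""
    return ["active" if rule(n, cluster_size) else "offline" for n in group_sizes]


# 5 nodes split 3/2: the majority side keeps serving, the minority goes offline.
majority_split = classify_partition([3, 2], 5)

# 5 nodes split 2/2/1: nobody has a majority. Service lost, data fine.
no_consensus = classify_partition([2, 2, 1], 5)

# 4 nodes split 2/2 under the naive rule: both sides think they are in charge.
split_brain = classify_partition([2, 2], 4, rule=naive_has_quorum)
```

With a strict majority at most one side of any partition can stay active, which
is why the worst case there is an outage rather than diverging data.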

So: consider the failure modes, and then consider not so much whether the tech
you are using is vulnerable to them, but "if it goes wrong, does it matter?"


Even before HDFS had HA with ZK/BookKeeper, it didn't fail very often. And if
you looked at the causes of those failures, things like backbone switch failure
are so traumatic that ZK/etcd failures aren't going to make much of a
difference. The filesystem is down.

Generally, integrity gets priority over availability. That said, S3 and the
like have put availability ahead of consistency; Cassandra can offer that
too. Sometimes it is the right strategy.
