@Steve your arguments make sense. However, a good majority of people who have
extensive experience with ZooKeeper prefer to avoid it, and given the ease of
Consul (which, by the way, uses Raft for leader election) and etcd, a lot of us
are more inclined to avoid ZK.
And yes, any technology needs time to mature, but that shouldn't stop us from
transitioning. For example, people started using Spark when it was first
released instead of waiting for Spark 2.0, which has a lot of optimizations
and bug fixes.

On Fri, Aug 26, 2016 2:50 AM, Steve Loughran ste...@hortonworks.com wrote:

On 25 Aug 2016, at 22:49, kant kodali < kanth...@gmail.com > wrote:
Yeah, so it seems like it's a work in progress. At the very least, Mesos took
the initiative to provide alternatives to ZK. I am really looking forward to
this.
https://issues.apache.org/jira/browse/MESOS-3797

I worry about any attempt to implement distributed consensus systems: they take
time in production to get right.
1. There's the need to prove that what you are building is valid if the
implementation matches the specification. That has apparently been done for ZK,
though given the complexity of the maths involved, I cannot vouch for that
myself:
https://blog.acolyer.org/2015/03/09/zab-high-performance-broadcast-for-primary-backup-systems/
2. You need to run it in production to find the problems. Google's Chubby paper
hints at the things they found went wrong there. As far as ZK goes, Jepsen
hints that it's robust:
https://aphyr.com/posts/291-jepsen-zookeeper
If it has weaknesses, I'd point at:
- its security model
- its lack of helpfulness when there are Kerberos/SASL auth problems (ZK server
closes connection; client sees connection failure and retries)
- the fact that its failure modes aren't always understood by people coding
against it.
http://blog.cloudera.com/blog/2014/03/zookeeper-resilience-at-pinterest/
The Raft algorithm appears to be easier to implement than Paxos; there are
things built on it and I look forward to seeing what works/doesn't work in
production.
Certainly Aphyr found problems when pointing Jepsen at etcd, though as that was
a 2014 piece of work, I expect those specific problems to have been addressed.
The main thing is: it shows how hard it is to get things right in the presence
of complex failures.
Finally, regarding S3:
You can use the S3 object store as a source of data in queries/streaming, and,
if done carefully, as a destination. Performance is variable... something some
of us are working on there, across S3A, Spark and Hive.
Conference placement: I shall be talking on that topic at Spark Summit Europe if
you want to find out more: https://spark-summit.org/eu-2016/

On Thu, Aug 25, 2016 2:00 PM, Michael Gummelt mgumm...@mesosphere.io wrote:
Mesos also uses ZK for leader election. There seems to be some effort in
supporting etcd, but it's in progress: 
https://issues.apache.org/jira/browse/MESOS-1806

On Thu, Aug 25, 2016 at 1:55 PM, kant kodali < kanth...@gmail.com > wrote:
@Ofir @Sean very good points.
@Mike We don't use Kafka or Hive, and I understand that ZooKeeper can do many
things, but for our use case all we need it for is high availability. Given the
frustrations of the devops people here at our company, who have extensive
experience managing large clusters, we would be very happy to avoid ZooKeeper.
I also heard that Mesos can provide high availability through etcd and Consul,
and if that is true I will be left with the following stack:

Spark + Mesos scheduler + distributed file system (or, to be precise,
distributed storage, since S3 is an object store; I guess this will be HDFS for
us) + etcd & Consul. Now the big question for me is: how do I set all this up?
