@Mich Of course, and I gave some context in my previous message as well.
Needless to say, the tools used by many of the banks I have come across, such
as Citi, Capital One, Wells Fargo and Goldman Sachs, are pretty laughable when
it comes to compliance and security. They somehow think they are secure when
they aren't.





On Fri, Aug 26, 2016 5:46 AM, Mich Talebzadeh mich.talebza...@gmail.com wrote:
And yes any technology needs time for maturity but that said it shouldn't stop
us from transitioning...
It depends on the application and how mission-critical the business it is
deployed for is. If you are using a tool for a bank's Credit Risk
(Surveillance, Anti-Money Laundering, Employee Compliance, Anti-Fraud etc.) and
the tool misses a big chunk for whatever reason, the first thing that will
happen is that the bank will be fined ($millions) and I will be looking for a
new job in London transport.
On the other hand, if the tool is used for social media, sentiment analysis and
all that sort of stuff, I don't think anyone is going to lose sleep.
HTH








Dr Mich Talebzadeh



LinkedIn https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com




Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
damage or destruction of data or any
other property which may arise from relying on this email's technical content is
explicitly disclaimed. The author will in no case be liable for any monetary
damages arising from such loss, damage or destruction.




On 26 August 2016 at 12:58, kant kodali < kanth...@gmail.com > wrote:
@Steve your arguments make sense. However, a good majority of people who have
extensive experience with ZooKeeper prefer to avoid it, and given the ease of
Consul (which, by the way, uses Raft for the election) and etcd, a lot of us
are more inclined to avoid ZK.
And yes, any technology needs time to mature, but that shouldn't stop us from
transitioning. For example, people started using Spark when it was first
released instead of waiting for Spark 2.0, which has a lot of optimizations
and bug fixes.





On Fri, Aug 26, 2016 2:50 AM, Steve Loughran ste...@hortonworks.com wrote:

On 25 Aug 2016, at 22:49, kant kodali < kanth...@gmail.com > wrote:
Yeah, so it seems like it's a work in progress. At the very least Mesos took
the initiative to provide alternatives to ZK. I am just really looking forward
to this.
https://issues.apache.org/jira/browse/MESOS-3797






I worry about any attempt to implement distributed consensus systems: they take
time in production to get right.
1. There's the need to prove that what you are building is valid if the
implementation matches the specification. That has apparently been done for ZK,
though given the complexity of the maths involved, I cannot vouch for that
myself:
https://blog.acolyer.org/2015/03/09/zab-high-performance-broadcast-for-primary-backup-systems/
2. You need to run it in production to find the problems. Google's Chubby paper
hints at the things they found went wrong there. As far as ZK goes, Jepsen
hints it's robust:
https://aphyr.com/posts/291-jepsen-zookeeper
If it has weaknesses, I'd point at:
- its security model
- its lack of helpfulness when there are Kerberos/SASL auth problems (ZK server
  closes connection; client sees connection failure and retries)
- the fact that its failure modes aren't always understood by people coding
  against it.
http://blog.cloudera.com/blog/2014/03/zookeeper-resilience-at-pinterest/
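The auth complaint above is that a client cannot tell a rejected login from a
dropped connection, so it keeps retrying something that will never succeed. As
a rough illustration only: assuming a client library did surface the two cases
as different errors, a retry loop could fail fast on one and back off on the
other. The exception names and call_with_retries below are invented for this
sketch and are not ZooKeeper client classes.

```python
import time


class TransientConnectionError(Exception):
    """Hypothetical: the connection dropped for a reason that may clear up."""


class AuthenticationError(Exception):
    """Hypothetical: credentials were rejected; retrying cannot help."""


def call_with_retries(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run operation(), retrying only transient failures with backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except AuthenticationError:
            # Fail fast: to a naive client this looks like a dropped
            # connection, but no number of retries will fix it.
            raise
        except TransientConnectionError:
            if attempt == max_attempts:
                raise
            # Exponential backoff between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
```

The sleep parameter is injectable so the loop can be exercised without
actually waiting.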
The Raft algorithm appears to be easier to implement than Paxos; there are
things built on it and I look forward to seeing what works/doesn't work in
production.
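To give a feel for why Raft is considered approachable, here is a toy sketch
of its voting rule only: each node grants at most one vote per term, and a
candidate needs a strict majority of the whole cluster. It deliberately leaves
out the log up-to-date check, election timeouts, RPCs and persistence, so it
is not a usable Raft implementation. Node and run_election are names made up
for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    current_term: int = 0
    voted_for: Optional[str] = None

    def request_vote(self, candidate: str, term: int) -> bool:
        """Grant at most one vote per term (log check omitted in this sketch)."""
        if term < self.current_term:
            return False           # stale candidate
        if term > self.current_term:
            self.current_term = term
            self.voted_for = None  # new term, so the vote is free again
        if self.voted_for in (None, candidate):
            self.voted_for = candidate
            return True
        return False


def run_election(candidate: Node, peers: list, reachable: set) -> bool:
    """Candidate bumps its term, votes for itself, and needs a strict majority."""
    candidate.current_term += 1
    candidate.voted_for = candidate.name
    votes = 1
    for peer in peers:
        # Unreachable peers simply never answer in this toy model.
        if peer.name in reachable and peer.request_vote(
            candidate.name, candidate.current_term
        ):
            votes += 1
    cluster_size = len(peers) + 1
    return votes > cluster_size // 2
```

Even this much shows the two properties people rely on: a minority partition
cannot elect a leader, and two candidates cannot both win the same term. The
hard part, as noted above, is everything this sketch leaves out.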
Certainly Aphyr found problems when he pointed Jepsen at etcd, though as that
is a 2014 piece of work, I expect those specific problems to have been
addressed. The main thing is: it shows how hard it is to get things right in
the presence of complex failures.
Finally, regarding S3:
You can use the S3 object store as a source of data in queries/streaming, and,
if done carefully, as a destination. Performance is variable... something some
of us are working on, across S3a, Spark and Hive.
Conference placement: I shall be talking on that topic at Spark Summit Europe
if you want to find out more: https://spark-summit.org/eu-2016/

On Thu, Aug 25, 2016 2:00 PM, Michael Gummelt mgumm...@mesosphere.io wrote:
Mesos also uses ZK for leader election. There seems to be some effort in
supporting etcd, but it's in progress:
https://issues.apache.org/jira/browse/MESOS-1806
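For anyone following the thread who has not used a coordination service for
leader election, the general pattern is roughly "whoever holds a time-limited
lease is the leader, and must keep renewing it". The sketch below is an
in-memory stand-in to show that idea only. LeaseStore and its methods are
hypothetical and are not the ZooKeeper, etcd, Consul or Mesos API; the real
services use their own primitives and put a consensus protocol behind the
store, none of which is modelled here.

```python
class LeaseStore:
    """In-memory stand-in for a coordination service (hypothetical API)."""

    def __init__(self):
        self._holder = None
        self._expires_at = 0.0

    def try_acquire(self, candidate, now, ttl):
        """Take or renew the leader lease; return True if candidate now leads."""
        expired = now >= self._expires_at
        if self._holder is None or expired or self._holder == candidate:
            self._holder = candidate
            self._expires_at = now + ttl
            return True
        return False

    def leader(self, now):
        """Return the current leader, or None if the lease has lapsed."""
        return self._holder if now < self._expires_at else None


store = LeaseStore()
store.try_acquire("master-1", now=0.0, ttl=10.0)   # master-1 becomes leader
store.try_acquire("master-2", now=5.0, ttl=10.0)   # refused, lease still held
store.try_acquire("master-2", now=11.0, ttl=10.0)  # master-1 lapsed, takeover
```

Time is passed in explicitly so the behaviour is deterministic; a real client
would renew well before the TTL runs out.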

On Thu, Aug 25, 2016 at 1:55 PM, kant kodali < kanth...@gmail.com > wrote:
@Ofir @Sean very good points.
@Mike We don't use Kafka or Hive, and I understand that ZooKeeper can do many
things, but for our use case all we need it for is high availability. Given the
frustrations of the devops people here in our company, who have extensive
experience managing large clusters, we would be very happy to avoid ZooKeeper.
I also heard that Mesos can provide high availability through etcd and Consul,
and if that is true I will be left with the following stack:

Spark + Mesos scheduler + distributed file system (or, to be precise,
distributed storage, since S3 is an object store; I guess this will be HDFS for
us) + etcd & Consul. Now the big question for me is: how do I set all this up?
