Re: write performance issue in 3.6.2

2021-05-03 Thread Michael Han
> > > > > The following are some updates on the issue. > > > > > > 1. We've checked the fine grained metrics and found that the > > > CommitProcessor was the bottleneck. The commit_commit_proc_req_queued > and > > > the write_commitproc_time_m

Re: write performance issue in 3.6.2

2021-05-03 Thread Michael Han
ottler. > > > Please let me know if you or anyone has any questions. > > Thanks, > > Li > > > > On Tue, Apr 20, 2021 at 8:03 PM Michael Han wrote: > > > What is the workload looking like? Is it pure write, or mixed read write? > > > > A coup

Re: write performance issue in 3.6.2

2021-04-20 Thread Michael Han
What is the workload looking like? Is it pure write, or mixed read write? A couple of ideas to move this forward: * Publish the performance benchmark so the community can help. * Bisect git commit and find the bad commit that caused the regression. * Use the fine grained metrics introduced in 3.6

Re: Correlate Ephemeral Owner with connected session

2020-11-01 Thread Michael Han
>> but I'm not sure if the sid is exposed anywhere in the API (if it is, I haven't found it yet and would appreciate guidance). The session id can be retrieved through the Stat object passed to various ZooKeeper APIs (like getData) - once you get a Stat object, calling getEphemeralOwner would return t

Re: Ordering guarantees for watch notifications

2020-10-23 Thread Michael Han
>> What's the order there? Watchers are triggered at the end of processing a transaction (create / delete / setData and so on), after the data tree is updated. A fired watcher event will be queued on the server's response queue (which guarantees FIFO order for the same session). The client - server

Re: Sequential Consistency Guarantees

2020-08-27 Thread Michael Han
Do you mind providing a concrete example of the "evidence that's pointing towards the opposite direction" just to make sure we are on the same page on the topic. On a side note, the latest document describing consistency guarantees is here: https://github.com/apache/zookeeper/blob/master/zookeeper

Re: How to deliberately cause a split brain?

2020-06-18 Thread Michael Han
There are different cases for split brain and how to test the monitor code depends on what signals you are using - but most usually, a split brain case can be created by artificially splitting two quorums out of a single quorum through a manual configuration change (e.g. a quorum of 7 servers can be split in

Re: Need Help with Maven Build

2020-05-19 Thread Michael Han
Hi Jun - which Maven version are you using? If it's 3.5.x, try upgrading to 3.6.x. I had the exact same issue a while back and upgrading Maven fixed it, so I didn't bother to debug. That said, it would be interesting to understand why we failed under a specific version of Maven / env, so cc dev list where we

Re: Using 100's of ZK Observers

2020-04-10 Thread Michael Han
If you have 100s of 1000s of ZK clients then having an observer in each pod will presumably reduce traffic, as most of the fan-out traffic from server to clients is localized to each pod. An observer is not part of the quorum, and a quorum can't scale past a few servers (typically just 5 or 7). Observers can

Re: question on ZAB protocol

2020-02-17 Thread Michael Han
>> so the client and the cluster has an inconsistent view. I would be reluctant to conclude this is an inconsistent view as a client should always consult server to get the latest state, rather than derive the state from the response of the request, which is not reliable if the request "fails" as

Re: Zookeeper resolving to old host IP addresses

2020-01-21 Thread Michael Han
Could be ZOOKEEPER-1506, though this should be fixed already in 3.4.14. On Tue, Jan 21, 2020 at 2:01 PM rammohan ganapavarapu < rammohanga...@gmail.com> wrote: > Hi Enrico, > > I see same with both 3.4.5 and 3.4.14 > > Ram > > On Tue, Jan 21, 2020 at 1:53 PM Enrico Olivelli > wrote: > > > Hi, >

Re: [ANNOUNCE] Enrico Olivelli new ZooKeeper PMC Member

2020-01-21 Thread Michael Han
Congrats, Enrico! On Tue, Jan 21, 2020 at 1:57 PM Jordan Zimmerman wrote: > Well deserved. Congratulations. > > > Jordan Zimmerman > > > On Jan 21, 2020, at 4:40 PM, Flavio Junqueira wrote: > > > > I'm pleased to announce that Enrico Olivelli recently became the newest > Z

Re: clientCnxnSocket#updateLastSendAndHeard() method usage

2019-12-12 Thread Michael Han
We had some prod issues previously related to the usage of the cached "now" variable (and the lack of a consistent access pattern for updateNow) in the Java client - we had a patch internally (basically what's described in ZOOKEEPER-2471) that removed usage of the cached value and instead calculates "no

Re: Snapshot creation after 3.4.14 -> 3.5.6 upgrade

2019-12-12 Thread Michael Han
>> My question is: Is there a way to force the snapshot creation / sync from the leader? 3.5.6 will automatically create a clean snapshot as part of server start up process. So a snapshot should be available after initial upgrade and there is no need to force a snapshot creation. On Mon, Dec 9, 2

Re: Any interest in a gRPC version of ZooKeeper

2019-11-26 Thread Michael Han
ree/master/zookeeper < > https://github.com/Randgalt/zkgrpc/tree/master/zookeeper> > > Or, am I missing something you're seeing? > > -Jordan > > > On Nov 24, 2019, at 4:40 PM, Michael Han wrote: > > > >>> That's 100% protobuf/gRPC > >

Re: Any interest in a gRPC version of ZooKeeper

2019-11-24 Thread Michael Han
>> That's 100% protobuf/gRPC Yes agree. Sorry, I should be probably more clear. What I meant "serialization format" in this case is jute's Record, which is still used in the POC code base. The wire serialization format is protobuf, and it's converted to / from Record through the rpc mapper utility

Re: Any interest in a gRPC version of ZooKeeper

2019-11-20 Thread Michael Han
>> The goal is to make it possible to easily write ZooKeeper clients in non-JVM languages. The proof of concept is still using jute as the serialization format, which makes writing a client library harder. Using protobuf as the serialization format might achieve this goal the marshal / unmarshal code can be g

Re: How to scale ZooKeeper to support 10K concurrent connections?

2019-09-27 Thread Michael Han
>> can launch tens of thousands of calls Is it possible for you to quantify this in a form of (read and write) request per second, and the average request payload if it's OK to disclose? This information is critical on shaping the best scaling solution. Without knowing any of ballpark numbers of

Re: PoweredBy Zookeeper

2019-09-24 Thread Michael Han
link to the doc: https://github.com/apache/zookeeper/blob/master/zookeeper-docs/src/main/resources/markdown/zookeeperUseCases.md On Tue, Sep 24, 2019 at 4:36 AM Enrico Olivelli wrote: > Cool > > I am not sure, do we have to wait for 3.6 release before updating the > website? > > Enrico > > Il m

Re: how session expire works in zookeeper codebase?

2019-09-24 Thread Michael Han
>> However, session could expire after connection established. Where is the latter case trigger? ZooKeeper checks every incoming request against the session it belongs to. When a session is expiring, all requests belonging to this session will fail the session check and the responses generated will contain

Re: Ephemeral znode deleted infers session expired?

2019-09-20 Thread Michael Han
>> I'd like to know whether an ephemeral znode deleted infers its corresponding session expired. Yes as far as I know - assuming no one else was messing with the same ephemeral node. On Thu, Sep 19, 2019 at 7:39 AM Zili Chen wrote: > Of course it is ensured that no other operations delete th

Re: Leader election and leader operation based on zookeeper

2019-09-20 Thread Michael Han
>> thus contender-1 commit a write operation even if it is no longer the leader I am assuming the "write operation" here is write to ZooKeeper (as opposed to write to an external storage system)? If so: >> contender-1 recovers from full gc, before it reacts to revoke leadership event, txn-1 retri

Re: Re: a misunderstanding of ZAB

2019-09-05 Thread Michael Han
will be committed; if he has not p1, then p1 will be dropped. > > so for a client, if write query takes too much time, the client may > > receive Timeout Exception, and it must query servers again to know > whether > > previous write is SUCCESS or FAIL? > > > >

Re: a misunderstanding of ZAB

2019-09-03 Thread Michael Han
+1 with what Alex has said. The commit case is easy to understand. For skip case, think this example: old quorum: F1 F2 F3 F4 F5, with F1 as L1. L1 has p on F1 and F2. new quorum: F1 F2 F3 F4 F5, with F3 as L2. It's possible, because although F1 and F2 has latest zxid, they could be partitioned a

Re: [Question] How watches work?

2019-08-29 Thread Michael Han
>> why we also trigger watches on server side. Because the server side needs to generate the watched events and deliver them to the client side, so the client side can trigger the watcher. >> Any advice on how watches work, both client side and server side? At a high level, think of it as a push based change notif

Re: create or setData in transaction?

2019-08-14 Thread Michael Han
ause ZK is linearized anyway). > > > > That leaves isolated which is kind of hard to talk about with ZK since > all > > operations are fast and sequential. > > > > On Wed, Aug 14, 2019 at 3:12 PM Michael Han wrote: > > > > > ... > > >

Re: create or setData in transaction?

2019-08-14 Thread Michael Han
>> Is there any way to do an "if-else" transaction? No for your use case. The only remotely related conditional operation you can express with multi-op is by using check operator (Op.check), where you can check a zNode's version and only execute subsequent operation in multi op when version matche

Re: Can SSL capability be satisfied by a smaller dependency than netty-all?

2019-08-01 Thread Michael Han
>> SSL capability can be satisfied by one of the smaller netty jars, rather than netty-all A brief look at the imports indicates that we might only need the handler and transport jars from Netty. I'd suggest creating a JIRA to request this change. On Tue, Jul 30, 2019 at 1:11 PM Shawn Heisey wr

Re: Issue migrating from Zookeeper 3.4.14 to 3.5.5

2019-07-29 Thread Michael Han
. Is it > because the format changed in 3.5.5 compared to 3.4.14? > > On Mon, Jul 29, 2019 at 11:25 PM Michael Han wrote: > > > >> java.io.IOException: No snapshot found, but there are log entries. > > Something is broken! > > > > This is expected behavio

Re: Issue migrating from Zookeeper 3.4.14 to 3.5.5

2019-07-29 Thread Michael Han
>> java.io.IOException: No snapshot found, but there are log entries. Something is broken! This is expected behavior introduced in ZOOKEEPER-2325. We don't want to end up with potential inconsistent state across the ensemble when recovering from empty snapshot. To continue upgrade, just delete al

Re: Zookeeper latency calculation

2019-07-17 Thread Michael Han
nt, not float This is now a float on master branch, and the change was made in ZOOKEEPER-2641. I remember this because this actually breaks one of our internal metrics system, where our system expects an int (the old type). On Wed, Jul 17, 2019 at 8:43 PM Michael Han wrote: > >> alwa

Re: Zookeeper latency calculation

2019-07-17 Thread Michael Han
>> always give avg_latency "0" The latency metrics depend on the workload. On Wed, Jul 17, 2019 at 1:34 AM Enrico Olivelli wrote: > On Tue, Jul 16, 2019, 19:05 rammohan ganapavarapu > wrote: > > > Hi, > > > > I am trying to understand how zookeeper latency calculated, mntr command > > always

Re: Is 3.5.5 client compatible with 3.4.x servers

2019-07-17 Thread Michael Han
In theory, they should be compatible. See backward compatibility section https://cwiki.apache.org/confluence/display/ZOOKEEPER/ReleaseManagement (where Major is 3, Minor is 5 and Minor - 1 is 4, in this case). In practice, this one looks incompatible. The Op code 15 in 3.5.5 client was added in h

Re: Is there a recommended open source GUI tool for monitoring 'zookeeper'?

2019-01-09 Thread Michael Han
You might want to check out exhibitor: https://github.com/soabase/exhibitor On Wed, Jan 9, 2019 at 4:55 PM 유정인 wrote: > Hi > > Is there a recommended open source GUI tool for monitoring 'zookeeper'? > > > >

Re: Getting Authentication Not valid while running reconfig Command

2018-11-06 Thread Michael Han
Please check out the reconfig release document for 3.5.3 beta, in particular section "Access Control": https://zookeeper.apache.org/doc/r3.5.3-beta/zookeeperReconfig.html *"The dynamic configuration is stored in a special znode ZooDefs.CONFIG_NODE = /zookeeper/config. This node by default is read

Re: [Help Wanted] Will zookeeper merge change events?

2018-10-23 Thread Michael Han
Hi Jun, >> will it only notify the client of the 100th event or all events from 2 - 100 will be notified? All events will be notified. Each watched event will be materialized as a server side response and on client side, each watched event will be processed individually. Depend on how your set w

Re: document for zk internals

2018-10-04 Thread Michael Han
>> which mentioned LeaderElection and FastLeaderElection. The document here is a little bit outdated. We deprecated the old LE implementation (LeaderElection) after 3.4.0 release, and the only leader election in use (for both stable 3.4.x and 3.5/6) is now FastLeaderElection. So now we only have a

Re: Digest auth with classic TCP transport

2018-09-27 Thread Michael Han
>> I have not found any evidence that Zookeeper server nor (Java) client supports TLS in version 3.4.13. We support TLS for client-server (and soon server-server) connections on 3.5 releases. There is no plan to back port these features to 3.4 which is the current stable branch, because we only ba

Re: can not know the process name from zk log

2018-09-12 Thread Michael Han
I have a patch that basically did what OP wanted - it allows the client to pass more detailed information to the server for client tracking. It's a useful feature, for debugging and in future, for ZK to support multi-tenancy and enforced quota. I'll try to upstream that patch via https://issues.apache.org/jira/b

Re: Zookeeper consistency

2018-07-17 Thread Michael Han
>> I think that Zookeeper is linearable only if there are only write operation. Yes writes are linearizable because writes are totally ordered globally. For read, a linearizable read needs to read the latest writes in the system at the point in time the read is issued; so by this definition, ZK r

[ANNOUNCE] Apache ZooKeeper 3.4.13

2018-07-16 Thread Michael Han
The Apache ZooKeeper team is proud to announce Apache ZooKeeper version 3.4.13. ZooKeeper is a high-performance coordination service for distributed applications. It exposes common services - such as naming, configuration management, synchronization, and group services - in a simple interface so

Re: Dose client read dirty data in zk release-3.5.4 ?

2018-06-17 Thread Michael Han
Data synchronization is already done if the execution hits zk.startup (note the previous while loop will only break once the learner receives the leader's up-to-date message). On Wed, Jun 13, 2018 at 10:51 PM, yuzhou li wrote: > The main code is at Learner.java syncWithLeader like this: > if (qp.getTyp

Re: what's the different between acceptedEpoch and currentEpoch?

2018-06-10 Thread Michael Han
The two variables serve different purposes. acceptedEpoch stores the epoch of the last NEWEPOCH message received and currentEpoch stores the epoch of the last NEWLEADER message received. They were introduced in ZOOKEEPER-335, please check that JIRA if you are interested. I think the ZAB protocol i

Re: A question about cross-version client/server compatibility

2018-05-22 Thread Michael Han
Please check out Backward Compatibility section in https://cwiki.apache.org/confluence/display/ZOOKEEPER/ReleaseManagement. A few other comments inline. On Tue, May 22, 2018 at 2:47 PM, Shawn Heisey wrote: > Somebody on the solr-user mailing list has posed a question about > whether they can us

Re: Zookeeper 3.5.3 reconfig blocked by ACL

2017-10-17 Thread Michael Han
>> The way this is set up it seems only a superuser enabled cluster can use the reconfig command. You can also configure the ACL associated with the "/config" znode so your chosen users have permission to both read and write the config znode, after they are authenticated (using your favorite authe

Re: Zookeeper 3.5.3-beta reconfigure command

2017-10-13 Thread Michael Han
Please note that the link to the trunk doc https://zookeeper.apache.org/doc/trunk/zookeeperReconfig.html is very out of date - please use the documents packaged within the release. In 3.5.3 beta we disabled reconfig by default; to use it you need to enable the feature first (see the doc in the re

Re: How to prevent others from accessing our zookeeper service?

2017-08-21 Thread Michael Han
no current way to keep anonymous users > from connecting at all. > > There have been numerous proposals to use SASL to solve this problem and > there is an open PR by Michael Han > (https://github.com/apache/zookeeper/pull/118), but nothing of the sort > has been committed yet. > >

Re: Upgrade of Zookeeper and Kafka

2017-08-16 Thread Michael Han
I think you are in the wrong thread. What Patrick replied is this: http://zookeeper-user.578899.n2.nabble.com/Upgrade-of-Zookeeper-and-Kafka-td7583242.html, and what you asked is: http://zookeeper-user.578899.n2.nabble.com/Error-connecting-to-ZooKeeper-server-td7583243.html On Wed, Aug 16, 2017 at

Re: Using ClientCnxnSocketNetty over ClientCnxnSocketNIO in 3.5

2017-07-21 Thread Michael Han
feature. On Thu, Jul 20, 2017 at 1:18 PM, Enrico Olivelli wrote: > Michael, > Thank you for your quick response > > On Thu, Jul 20, 2017, 19:15 Michael Han wrote: > > > >> Is any plan to move to ClientCnxnSocketNetty but default ? > > > > The plan

Re: ZooKeeper Time Synchronization

2017-07-21 Thread Michael Han
One clarification on "System Time" here - ZK uses two types of time/clock: * The wall-clock time, which is recorded as part of zNode stats such as mtime and is exposed to users. * The monotonic clock, which ZK uses in various places (e.g. failure detection) to measure intervals. Note in 3.4 ZK still us
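The distinction between the two clocks can be illustrated with a minimal, standalone Java sketch. It is not ZooKeeper code: the class and method names are hypothetical, and it only assumes the standard JVM calls System.currentTimeMillis() (wall-clock, can jump if the system time is changed) and System.nanoTime() (the usual JVM choice for measuring elapsed intervals).

```java
import java.time.Instant;

// Hypothetical illustration class; not part of ZooKeeper.
public class ClockKinds {
    // Elapsed time between two System.nanoTime() readings, in milliseconds.
    public static long elapsedMillis(long startNanos, long endNanos) {
        return (endNanos - startNanos) / 1_000_000L;
    }

    public static void main(String[] args) throws InterruptedException {
        // Wall-clock time: suitable for user-facing stats (like mtime),
        // but it moves if an operator or NTP adjusts the system clock.
        long wallClockMs = System.currentTimeMillis();

        // Interval measurement: only differences between readings are meaningful.
        long start = System.nanoTime();
        Thread.sleep(50);
        long end = System.nanoTime();

        System.out.println("wall-clock timestamp: " + Instant.ofEpochMilli(wallClockMs));
        System.out.println("elapsed ms (interval clock): " + elapsedMillis(start, end));
    }
}
```

Using the interval clock for timeouts means a change to the system time should not, by itself, make a measured interval look longer or shorter.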

Re: ZooKeeper Time Synchronization

2017-07-21 Thread Michael Han
mtime etc is exposed to user to provide basic stats info; ZK itself does not use these times. These times will just be recorded as they are and carried over and does not impact anything in case leader election etc happens. On Fri, Jul 21, 2017 at 11:30 AM, Amr wrote: > Hi Abe, > > Thanks a lot f

Re: Using ClientCnxnSocketNetty over ClientCnxnSocketNIO in 3.5

2017-07-20 Thread Michael Han
>> Is any plan to move to ClientCnxnSocketNetty but default ? The plan was to replace the NIO engine. See ZOOKEEPER-733. For some features (like client-server SSL) it is a requirement to switch to Netty. The Netty socket implementation is less mature compared to NIO (there are bugs reported over time and

Re: What is the release cadence?

2017-07-17 Thread Michael Han
Most recently we do a stable release approximately every six months. It's a good time to start planning next 3.4 release, which will include many important bug fixes. I'll start a discussion on dev list regarding that topic later. On Mon, Jul 17, 2017 at 3:22 PM, Ben Sherman wrote: > What's the

Re: java.io.EOFException

2017-06-29 Thread Michael Han
On Wed, Jun 28, 2017 at 11:59 PM, Mike Richardson wrote: > Unsubscribe > > > Unsubscribe does not work like this. To unsubscribe, please click the Unsubscribe from List <user-unsubscribe@zookeeper.apache.org> link from http

Re: How to add nodes to a Zookeeper 3.5.3-beta ensemble with reconfigEnabled=false

2017-06-23 Thread Michael Han
On Fri, Jun 23, 2017 at 6:09 AM, Shawn Heisey wrote: > On 6/22/2017 11:39 PM, Alexander Shraer wrote: > > The described behavior is the intended one - in 3.5 configuration is > > part of the synced state and is updated when the server syncs with the > > leader. The only rolling upgrade I tested w

Re: How to add nodes to a Zookeeper 3.5.3-beta ensemble with reconfigEnabled=false

2017-06-23 Thread Michael Han
curity and would prefer to get rid of the flag. But if you must have > it, > > we have to prevent both in memory config updates (most important) and > > config file updates if reconfig is disabled. This sounds like a small > > change in quorumpeer, but perhaps I'm forgettin

Re: How to add nodes to a Zookeeper 3.5.3-beta ensemble with reconfigEnabled=false

2017-06-23 Thread Michael Han
. > > Cheers > Alex > > > On Thu, Jun 22, 2017 at 11:06 PM Michael Han wrote: > > > Hi Alex, thanks for clarification! > > > > It makes sense to me that users should use reconfig instead of rolling > > upgrade moving forward. The only concern is

Re: How to add nodes to a Zookeeper 3.5.3-beta ensemble with reconfigEnabled=false

2017-06-22 Thread Michael Han
e reconfig ? > > Alex > > > > > On Thu, Jun 22, 2017 at 10:18 PM, Michael Han wrote: > > > reconfigEnabled only disables reconfig command when > reconfigEnabled=false; > > it does not disable the feature by mute all code paths of the reconfig > > feature in

Re: How to add nodes to a Zookeeper 3.5.3-beta ensemble with reconfigEnabled=false

2017-06-22 Thread Michael Han
Setting reconfigEnabled=false only disables the reconfig command; it does not disable the feature by muting all code paths of the reconfig feature introduced in ZOOKEEPER-107. So regardless of the value of reconfigEnabled, 3.5.x ZK will create a static config file and a dynamic config file in an

Re: How to add nodes to a Zookeeper 3.5.3-beta ensemble with reconfigEnabled=false

2017-06-21 Thread Michael Han
You can still do rolling restarts for 3.5.x including 3.5.3-beta. A rolling restart requires editing zoo.cfg, the static configuration file, instead of zoo.cfg.dynamic.x, which is the dynamic reconfiguration file that stores reconfig parameters. This dynamic config file is managed by ZK and is no

Re: Client hangs waiting for connection

2017-06-20 Thread Michael Han
Sounds like a deadlock in the client library. One idea is to instrument your client code and dump the thread stack when the wait times out. The stack will hopefully contain the states of various threads and provide some insights on what to look for next. On Tue, Jun 20, 2017 at 3:14 PM, John Lindwall

Re: Zookeeper is always CP or AP in terms of CAP theorem

2017-06-19 Thread Michael Han
Martin had a good blog post about this - see the ZooKeeper case study section. https://martin.kleppmann.com/2015/05/11/please-stop-calling-databases-cp-or-ap.html On Mon, Jun 19, 2017 at 11:47 AM, Kaushal Shriyan wrote: > Hi, > > I am reading the CAP theorem and zookeeper either satisfies CP or

Re: How to secure zookeeper?

2017-06-13 Thread Michael Han
We just published a blog about 4lw and security today which provides more context about history and possible solutions, hope this also helps. https://blog.cloudera.com/blog/2017/06/apache-zookeeper-four-letter-words-and-security/ On Sat, Jun 3, 2017 at 9:43 AM, Novin Novin wrote: > thanks Flavi

Re: Ephemeral node not auto deleted after change the system time?

2017-05-19 Thread Michael Han
Please check https://issues.apache.org/jira/browse/ZOOKEEPER-2744 - if you are using 3.4.x this should be fixed in next release (3.4.11). On Fri, May 19, 2017 at 2:21 AM, Sigmond Hola wrote: > Scenario: > > 1. Client connect to zk server, and created a ephemeral sequential node; > 2. Change syst

Re: Observers taking long time to serve requests

2017-05-17 Thread Michael Han
ing leader to snapshotting is taking 30mins. > > Ram > > On May 16, 2017 2:07 PM, "Michael Han" wrote: > > > When an observer (and in general a follower) restarted, it will go > through > > these stages: > > > > * Look for leader by starting

Re: Observers taking long time to serve requests

2017-05-16 Thread Michael Han
When an observer (and in general a follower) restarts, it will go through these stages: * Look for the leader by starting a new leader election round; usually this is quick as there is already a leader. * Register with the leader and begin the synchronization phase - depending on the observer state the sync mig

Re: Follower drops out of quorum, can't reconnect

2017-05-10 Thread Michael Han
I would suggest creating a JIRA issue and attaching the full log of sid 5 (if that's possible). The log posted here does not have enough information to analyze what happened on sid 5 during the 15 minutes when it was trying to connect to an established quorum. Please also attach another one or two servers

Re: EOFException on snapshot dump

2017-04-25 Thread Michael Han
No debug mode afaik. I suspect the snapshot was corrupted - it's partial so the read was expecting more bytes than the file actually has, thus EOFException. A workaround would be to patch SnapshotFormatter so it catches the exception and prints what's already loaded and parsed, instead of bailing out and t

Re: What is the role of Zookeeper and its external Integration dependencies

2017-04-24 Thread Michael Han
Some notes on the CVE - it only affects the C client shell, which is not part of the C client API. Even if some of the projects mentioned here use the C client API (which afaik they do not), they should not be impacted by this specific CVE from a functional point of view. On Fri, Apr 21, 2017 at 6:48

Re: zookeeper node fails to communicate with Leader node

2017-04-20 Thread Michael Han
The script should be simple enough to debug. Maybe try executing the command yourself and see what happens? Could it be that JAVA_HOME was not set correctly? On Tue, Apr 18, 2017 at 1:24 PM, ravisinha0506 wrote: > I have a zookeeper cluster which includes 3 nodes. Zookeeper config is > mentione

Re: [ANNOUNCE] Apache ZooKeeper 3.5.3-beta

2017-04-20 Thread Michael Han
ine script was a nice feature to have by > default. > > On Wed, Apr 19, 2017 at 6:02 PM, Michael Han wrote: > > > >> pitfalls coming from 3.4.9 (or .10) to the 3.5.x release? > > If coming from 3.4.9, one note is all four letter words except srvr are > > disab

Re: [ANNOUNCE] Apache ZooKeeper 3.5.3-beta

2017-04-19 Thread Michael Han
ere any docs written yet or any known pitfalls coming > > from 3.4.9 (or .10) to the 3.5.x release? > > > > On Mon, Apr 17, 2017 at 10:48 AM, Michael Han wrote: > > > > > The Apache ZooKeeper team is proud to announce Apache ZooKeeper version > > > *3.5.3-b

[ANNOUNCE] Apache ZooKeeper 3.5.3-beta

2017-04-17 Thread Michael Han
The Apache ZooKeeper team is proud to announce Apache ZooKeeper version *3.5.3-beta*. ZooKeeper is a high-performance coordination service for distributed applications. It exposes common services - such as naming, configuration management, synchronization, and group services - in a simple interfac

Re: Two way (mutual) SSL authentication

2017-04-08 Thread Michael Han
Please check out https://cwiki.apache.org/confluence/display/ZOOKEEPER/ZooKeeper+SSL+User+Guide On Fri, Apr 7, 2017 at 1:12 PM, martin wrote: > Hello Is Zookeeper supporting 2-way (mutual or client authentication) > authentication SSL?I would like to use as a simple way to restrict clients > acc

Re: Zookeeper C-client API zookeeper_close does not always close the session at server

2017-04-06 Thread Michael Han
>> The documentation for zookeeper_close seems to indicate that the call will block until the session is cleaned up at the server or a failure occurs There is no guarantee that after the call of zookeeper_close the session will be cleaned up. Similar for Java client's close as well. ZOK return cod

Re: Automatically obtaining zookeeper server version

2017-04-04 Thread Michael Han
urrently > building out our infrastructure monitoring, and it would be useful to have > a more specific target date: > > Thanks, > > Marcos > > On Mon, Apr 3, 2017 at 11:56 AM, Michael Han wrote: > > > Server version is also exposed through JMX - that might be a be

Re: Automatically obtaining zookeeper server version

2017-04-03 Thread Michael Han
Server version is also exposed through JMX - that might be a better alternative than using four letter words, which will be deprecated in future. On Mon, Apr 3, 2017 at 10:16 AM, Keith Turner wrote: > A bit ago I wrote a blog post[1] about building shaded jars to run > Fluo applications with Spa

Re: Client backward compat with server

2017-03-28 Thread Michael Han
Releases with fixed major.minor version are backward compatible - so 3.4.9 is backward compatible with 3.4.6 (major=3, minor=4 in this case.). Backward compatible means two different versions of client and server can be mixed - in your case it could be 3.4.9 client with 3.4.6 server. So this should
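The rule stated above (same major.minor means client and server versions can be mixed) can be written down as a small helper. This is only a sketch of the rule as described in the message; the class and method names are hypothetical and this is not a ZooKeeper API.

```java
// Hypothetical helper; encodes only the "same major.minor" rule from the message above.
public class VersionCompat {
    // Returns true when both version strings share the same major and minor numbers,
    // e.g. "3.4.9" and "3.4.6" (major=3, minor=4).
    public static boolean sameMajorMinor(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        return pa.length >= 2 && pb.length >= 2
                && pa[0].equals(pb[0])
                && pa[1].equals(pb[1]);
    }

    public static void main(String[] args) {
        // The case from the message: 3.4.9 client with a 3.4.6 server.
        System.out.println(sameMajorMinor("3.4.9", "3.4.6"));
        // Different minor versions fall outside this particular rule.
        System.out.println(sameMajorMinor("3.5.5", "3.4.14"));
    }
}
```

A false result here only means the pair is not covered by the same-major.minor guarantee; it does not by itself say the two versions are incompatible.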

Re: shutdown Observer

2017-03-09 Thread Michael Han
> > datacenters, unless you know you have a solid network between them. If > your > > observers are falling offline "randomly", packet loss is a pretty likely > > culprit. > > > > On Thu, Mar 9, 2017 at 9:54 AM, Michael Han wrote: > > > > >

Re: shutdown Observer

2017-03-09 Thread Michael Han
The log indicates that your server socket on the observer timed out after syncing with the leader. It could simply be that the latency between your DCs exceeds the socket timeout configuration ZK uses. The timeout is calculated as tickTime * syncLimit so you might want to tweak these values to fit the la
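The arithmetic behind that timeout is simple enough to sketch. The class below is hypothetical (not ZooKeeper code), and the values tickTime=2000 and syncLimit=5 are illustrative sample settings, not values taken from this thread.

```java
// Hypothetical helper; computes the timeout exactly as described: tickTime * syncLimit.
public class SyncTimeout {
    // tickTimeMs is the tickTime setting in milliseconds; syncLimit is a number of ticks.
    public static long syncTimeoutMs(int tickTimeMs, int syncLimit) {
        return (long) tickTimeMs * syncLimit;
    }

    public static void main(String[] args) {
        // Illustrative values only: tickTime=2000, syncLimit=5 gives a 10000 ms timeout.
        System.out.println(syncTimeoutMs(2000, 5));
        // Raising syncLimit to 10 doubles the budget to 20000 ms.
        System.out.println(syncTimeoutMs(2000, 10));
    }
}
```

Since tickTime also drives other timing in the server, raising syncLimit is usually the less invasive of the two knobs when only the learner sync timeout needs more headroom.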

Re: RE: Zookeeper statup issue

2017-03-09 Thread Michael Han
ssue > > > > It stays there forever. The ZK version is 3.4.6. > > We just use the bin/zkServer.sh script to start up ZK. > > It seems not reproducible again. > > Also logged a bug https://issues.apache.org/jira/browse/ZOOKEEPER-2714 > for the issue. > >

Re: Zookeeper statup issue

2017-03-08 Thread Michael Han
Did your ZK server stay in this "not running" state forever - or is it eventually up and serving requests? If it's the latter, then this is not a bug, because during start up the ZK server has to initialize various sub systems after the server instance is initialized; so if there are client requests comi

Re: Zookeeper Cross Datacenter Cluster

2017-03-06 Thread Michael Han
Back up requires replication which has two types, synchronous and asynchronous. ZooKeeper quorum provides synchronous replication. But as mentioned, 2 DC will not work no matter how. You need at least three (and in general odd numbers - for majority quorum). There are quorum weights and groups tha
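Why two datacenters cannot work with a plain majority quorum can be checked with a few lines of arithmetic. The sketch below is hypothetical (not ZooKeeper code) and assumes simple majority voting only; it does not model the quorum weights and groups mentioned above.

```java
// Hypothetical helper; models plain majority quorums only.
public class MajorityQuorum {
    // Smallest number of voters that forms a majority of the ensemble.
    public static int majority(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    // True if the ensemble still has a majority after losing `lost` voters.
    public static boolean survivesLoss(int ensembleSize, int lost) {
        return ensembleSize - lost >= majority(ensembleSize);
    }

    public static void main(String[] args) {
        // 5 voters split 3 + 2 across two DCs: losing the DC with 3 voters loses quorum.
        System.out.println(survivesLoss(5, 3));
        // Losing the smaller DC is fine, but the ensemble is not safe against either DC failing.
        System.out.println(survivesLoss(5, 2));
        // 6 voters split 2 + 2 + 2 across three DCs: any single DC can fail.
        System.out.println(survivesLoss(6, 2));
    }
}
```

With any two-way split, one side always holds at least half of the voters, so losing that side leaves the other without a majority; a third location is what removes that single point of failure.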

Re: etcd performance comparison

2017-02-22 Thread Michael Han
'm more concerned about the fact that I saw a talk yesterday > >>> that > >>>> mentioned both etcd and consul as options for service discovery but > not > >>> ZK. > >>>> That feels like a big hit for our community. Orthogonal to this topic, >

Re: etcd performance comparison

2017-02-21 Thread Michael Han
Kudos to the etcd team for making this blog and thanks for sharing. >> I feel like they're running a questionable configuration. Looks like the test configuration does not have separate dir

Re: ZooKeeper DOS exploit published

2017-02-15 Thread Michael Han
I have a patch for https://issues.apache.org/jira/browse/ZOOKEEPER-2693 (pull request 179 ). Feedback will be highly appreciated. It would be good that we can get this in a few days as it is both a security fix and a blocker for two ongoing releases (3.

Re: are ephemeral nodes removed when client receives session expiration

2017-02-09 Thread Michael Han
e if you find it). Furthermore, > the mark the session closing code I posted only run on the lead as far as I > can see (again, please point me to the code) > > > > Just to repeat, the race is between the learner gets the quorum > closeSession and the client issue a read. No? >

Re: zookeeper and SSL

2017-02-09 Thread Michael Han
Hi Juan, >> I am wondering when 3.5 will become stable release? The current plan is to cut 3.5.3 beta release candidates this month, get it out, have folks test and use it, iterate, and eventually remove the beta tag to reach a stable release of 3.5 that replaces current 3.4. Sorry I don't h

Re: are ephemeral nodes removed when client receives session expiration

2017-02-08 Thread Michael Han
Id); > } > if (secureServerCnxnFactory != null) { > secureServerCnxnFactory.closeSession(sessionId); > } > cnxn.setSessionId(sessionId); > reopenSession(cnxn, sessionId, passwd, sessionTimeout); > } > > > > > > On Feb 7, 2017, at 3:46 PM,

Re: are ephemeral nodes removed when client receives session expiration

2017-02-07 Thread Michael Han
(or in > parallel as the original post seems to indicate). I think I can definitely > simulate this with a test but it will be tricky to make it pass/fail > deterministically so I didn’t try. > > Am I missing something? > > -Ryan > > > > On Feb 7, 2017, at 1:24 PM, Mic

Re: Observers taking a long time to recover after network outage

2017-02-07 Thread Michael Han
>> My expectation was that it would reconnect once the network healed. Right, it is intended to behave like that, but I see there are a couple of cases where it could take longer to recover: * Network condition is not stable after the outage - for example the latency is longer than what's configured f

Re: are ephemeral nodes removed when client receives session expiration

2017-02-07 Thread Michael Han
Zhang wrote: > I am a bit confused by the code > > On Jan 25, 2017, at 1:33 PM, Michael Han wrote: > > Does ZK guarantee that ephemeral nodes from a client are removed on the > server by the time the client receives a session expiration e

Re: Extremely different readings on different zookeeper deployments

2017-02-07 Thread Michael Han
+1 on checking the disk setup first. Also, it is good to check the server logs on the Windows 7 boxes to see if there is anything obviously suspicious. In particular, we log warnings if flushing the transaction to disk takes longer than a predefined threshold (1000 ms by default). Meanwhile another ex
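As a rough way to compare disk setups across deployments, the sketch below times a handful of write-plus-fsync cycles in a chosen directory. It is an illustration in Python, not what ZooKeeper itself does; the function name, the 4 KiB payload, and the number of rounds are all assumptions picked for the example.

```python
import os
import tempfile
import time


def fsync_latencies_ms(directory, rounds=20, payload=b"x" * 4096):
    """Append `payload` and fsync `rounds` times; return each fsync time in ms."""
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(rounds):
            os.write(fd, payload)
            start = time.perf_counter()
            os.fsync(fd)  # force the written data to stable storage
            latencies.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


if __name__ == "__main__":
    # Point this at the directory holding the transaction log on each box.
    samples = fsync_latencies_ms(tempfile.gettempdir())
    print(f"max fsync latency: {max(samples):.2f} ms")
```

Running it against the transaction log directory on each machine gives a baseline to set beside any slow-flush warnings in the server logs.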

Re: are ephemeral nodes removed when client receives session expiration

2017-01-26 Thread Michael Han
txns. However, saying that > ZK does not guarantee a consistent view isn't correct, the view of clients > is always consistent (we guarantee sequential consistency), but they aren't > necessarily the same and they don't necessarily reflect the latest > committed state.

Re: are ephemeral nodes removed when client receives session expiration

2017-01-25 Thread Michael Han
>> If you ask whether the client will see its ephemerals upon creating a new session, then the answer is that it shouldn't because the createSession txn will be ordered necessarily before the closeSession txn, which implies that the client should not see the ephemerals. Second this - so *for the s

Re: are ephemeral nodes removed when client receives session expiration

2017-01-25 Thread Michael Han
>> Does ZK guarantee that ephemeral nodes from a client are removed on the server by the time the client receives a session expiration event? "the server" is a vague definition, as a ZooKeeper ensemble is composed of multiple servers :). >> Therefore, it seems to be possible for a client to connect

Re: Zookeeper data loss scenarios

2017-01-05 Thread Michael Han
I suspect that you might have hit ZOOKEEPER-2325 / ZOOKEEPER-261, which could possibly cause data loss. Consider this case - we have A, B, C servers but for some reason A and B got replaced by Ex

Re: jepsen testing

2017-01-05 Thread Michael Han
Forwarding this to the user mailing list. On Wed, Jan 4, 2017 at 1:56 PM, Charles Allen wrote: > Hi All, > > A few years ago there was a Jepsen test for zookeeper > https://aphyr.com/posts/291-jepsen-zookeeper > > Since then there have been some improvements to zookeeper (3.5 in alpha) > and improvemen

Re: Zookeeper Ensemble Automation

2017-01-05 Thread Michael Han
>> I don’t see any indication when it will jump to beta or even stable. The ZooKeeper community is working on getting a release candidate of the 3.5.3 beta build out very soon (in a matter of weeks). So optimistically speaking we will reach beta very soon and hopefully have a stable release after that. >

Re: Zookeeper communication protocol

2017-01-02 Thread Michael Han
The wire protocol is documented at: https://github.com/apache/zookeeper/blob/master/src/zookeeper.jute There is also a tool to analyze the ZK messages, which might help for your case: https://github.com/twitter/zktraffic On Sun, Jan 1, 2017 at 11:24 PM, Ankit Shah wrote: > Hi, > > I need to debug t
