Re: [curator][lock-recipe] implementing locks with just ephemeral nodes

2020-02-06 Thread Ted Dunning
Also, if you are doing 100 million per day, then with a bit of peak to valley ratio, you are doing 5000 per second. That is pushing limits, particularly if your ZK has other loads or is on substandard hardware somehow. If you are storing things in a database, don't you have a create or fail operation (AKA
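
The arithmetic behind that estimate can be sketched as follows. The post only says "a bit of peak to valley ratio", so the `peak_to_average` parameter is an assumption introduced here for illustration; the figures of 100 million per day and 5000 per second come from the post.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate(ops_per_day: float) -> float:
    """Mean operations per second if load were perfectly flat."""
    return ops_per_day / SECONDS_PER_DAY


def peak_rate(ops_per_day: float, peak_to_average: float) -> float:
    """Peak operations per second for an assumed peak-to-average multiplier."""
    return average_rate(ops_per_day) * peak_to_average


avg = average_rate(100_000_000)
# Multiplier that would be needed to reach the 5000/s quoted in the post.
implied_multiplier = 5000 / avg
print(f"average {avg:.0f}/s, implied peak multiplier {implied_multiplier:.2f}")
```

A flat load works out to a little under 1200 operations per second, so the 5000 per second figure corresponds to a peak a bit over four times the average.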

Re: ZooKeeper Meetup @ Facebook is 1 Week Away! (January 29th 2020)

2020-01-28 Thread Ted Dunning
ks, > > Jeelani > > On 1/22/20, 12:57 PM, "Ted Dunning" wrote: > > Mohamed, > > Could you please answer the question about whether there will be an NDA > required at the door? > > > > On Wed, Jan 22, 2020 at 12:51 PM Mohamed Jeelan

Re: ZooKeeper Meetup @ Facebook is 1 Week Away! (January 29th 2020)

2020-01-22 Thread Ted Dunning
Mohamed, Could you please answer the question about whether there will be an NDA required at the door? On Wed, Jan 22, 2020 at 12:51 PM Mohamed Jeelani wrote: > ZooKeeper Meetup @ Facebook is just 1 week away! (January 29th 2020) - > Please RSVP if you haven’t yet at -

Re: Leader election and leader operation based on zookeeper

2019-09-21 Thread Ted Dunning
What I suggested is almost exactly what Jordan suggested. I should have read the rest of the thread before posting. On Sat, Sep 21, 2019 at 9:54 AM Ted Dunning wrote: > > I would suggest that using an epoch number stored in ZK might be helpful. > Every operation that the master ta

Re: Leader election and leader operation based on zookeeper

2019-09-21 Thread Ted Dunning
I would suggest that using an epoch number stored in ZK might be helpful. Every operation that the master takes could be made conditional on the epoch number using a multi-transaction. Unfortunately, as you say, you have to have the update of the epoch be atomic with becoming leader. The natural

Re: create or setData in transaction?

2019-08-15 Thread Ted Dunning
a fair lock implementation, you have better completion guarantees. Does this do what you want? On Wed, Aug 14, 2019 at 11:04 PM Zili Chen wrote: > Thanks for your explanation Michael and Ted :-) > > Best, > tison. > > > Ted Dunning wrote on Thu, Aug 15, 2019 at 1:22 PM: > >

Re: create or setData in transaction?

2019-08-14 Thread Ted Dunning
nd where is the "hard" > comes from? > > Best, > tison. > > > Ted Dunning wrote on Thu, Aug 15, 2019 at 9:40 AM: > > > The multi op is atomic (all other operations will be before or after the > > multi), consistent (all viewers will see all the effects or none, and

Re: create or setData in transaction?

2019-08-14 Thread Ted Dunning
The multi op is atomic (all other operations will be before or after the multi), consistent (all viewers will see all the effects or none), and durable (because ZK is linearized anyway). That leaves isolated which is kind of hard to talk about with ZK since all operations are fast and sequential.

Re: create or setData in transaction?

2019-08-14 Thread Ted Dunning
The multi operation *is* like a transaction. All the operations will succeed or none. On Wed, Aug 14, 2019 at 9:33 AM Andor Molnar wrote: > "it's said that ZooKeeper has a transaction mechanism” > > I’m still confused with this. ZooKeeper doesn’t support transactions to my > best knowledge.

Re: concept question for zookeeper

2018-07-15 Thread Ted Dunning
Check here for theory https://blog.acolyer.org/2015/03/10/zookeepers-atomic-broadcast-protocol-theory-and-practice/ And here https://www.semanticscholar.org/paper/Zab%3A-High-performance-broadcast-for-primary-backup-Junqueira-Reed/b02c6b00bd5dbdbd951fddb00b906c82fa80f0b3 On Sun, Jul 15, 2018,

Re: zookeeper cluster

2014-06-12 Thread Ted Dunning
The most common error that causes this is to not set up the myid files correctly. On Thu, Jun 12, 2014 at 4:47 PM, Cameron McKenzie mckenzie@gmail.com wrote: This is not correct, 3 is a minimum for redundancy. If 1 goes down, the other 2 can still form a quorum (as there are more than

Re: zookeeper needs 3 nodes to support 1 failure

2014-06-10 Thread Ted Dunning
Tuesday, June 10, 2014, Ted Dunning ted.dunn...@gmail.com wrote: On Tue, Jun 10, 2014 at 9:24 AM, Olivier Mallassi olivier.malla...@gmail.com wrote: What are the guarantees zookeeper provide and a data grid does not provide? the lock? a sequential consistency? You said

Re: Distributed Applocation using Zookeeper

2014-06-07 Thread Ted Dunning
Also, if you have to create several state flags in ZK (perhaps you have to choose several IP addresses, for instance) you can use the multi update capability. It is almost never necessary to use an actual lock recipe with ZK. Almost always what you really want is some kind of first come, first

Re: zookeeper watch limitation

2014-06-06 Thread Ted Dunning
It is a problem if you expect subsequent watches to go out in milliseconds. It isn't a problem if the resulting delays are OK with you. To me, it sounds like it will be just fine. If the herd effect is too much, you can always split the version flags into many pieces and update one version flag

Re: Partitioned Zookeeper

2014-05-19 Thread Ted Dunning
On Sun, May 18, 2014 at 10:38 PM, Pramod Biligiri pramodbilig...@gmail.com wrote: Do you see any other problems with the approach I'm taking. If we see any gains from it, we can look at the tricky issues next. No. The idea of partitioning ZK has come up often and something roughly like what

Re: Partitioned Zookeeper

2014-05-18 Thread Ted Dunning
Pramod, Have you looked at the multi command? That might cause you some serious heartburn. On Sun, May 18, 2014 at 8:25 PM, Pramod Biligiri pramodbilig...@gmail.com wrote: Hi, [Let me know if you want this thread moved to the Dev list (or even to JIRA). I was only seeing automated mails

Re: Partitioned Zookeeper

2014-05-18 Thread Ted Dunning
On Sun, May 18, 2014 at 9:36 PM, Pramod Biligiri pramodbilig...@gmail.com wrote: I guess you mean that you can't parallelize the workload because a multi command might require locking all the containers? Let me know if I'm missing something. Right. Getting that to work cleanly could be

Re: Restarting leader zookeeper instance made quorum lost

2014-04-09 Thread Ted Dunning
Your email is a little ambiguous. a) 5 instances with 3 as quorum could mean 5 instances configured and running normally. Or b) it could mean 5 instances with 2 instances that are down. In (a) restarting the leader instance *should* cause the cluster to do a leader election again and form a

Re: Restarting leader zookeeper instance made quorum lost

2014-04-09 Thread Ted Dunning
, but it should be hard to detect any outage. Thank you Best, Jae On Wed, Apr 9, 2014 at 12:06 PM, Ted Dunning ted.dunn...@gmail.com wrote: Your email is a little ambiguous. a) 5 instances with 3 as quorum could mean 5 instances configured and running normally. Or b) it could

Re: Best pratice

2014-03-21 Thread Ted Dunning
: Interesting. Especially since at work we've been leaning towards using bcache for performance reasons to be able to deal with the flood of input we get - not so much for ZooKeeper but for Cassandra. Do you have any opinions about bcache? http://bcache.evilpiepirate.org/ On 21 March 2014 07:14, Ted

Re: Best pratice

2014-03-20 Thread Ted Dunning
Test first without all that fancy gear. For all of the applications that you mention except Kafka, the actual transaction rate is near nil. A dedicated single spindle per ZK is likely to be luxurious accommodation. I doubt seriously that you need SSD's. On Thu, Mar 20, 2014 at

Re: Best pratice

2014-03-20 Thread Ted Dunning
SSD's also have the issue that it is common that recently written data is not actually persisted. Worse, new data might be persisted while slightly older data is not. These issues differ greatly across different hardware. Disks with write caching disabled are vastly better understood. On

Re: fsync latency causing zookeeper server to fall apart

2014-02-04 Thread Ted Dunning
The best way to mitigate this is to fix your platform problem. There is very little that ZK can do to help. If it takes 10 seconds to flush to disk, this is going to make these machines pretty intolerable for all kinds of applications. To specifically answer your question, yes, you can increase

Re: contrib REST?

2014-01-08 Thread Ted Dunning
If you are going to try to put a web-ish interface on ZK, look at web sockets so that you can get notifications. On Wed, Jan 8, 2014 at 11:46 AM, Jordan Zimmerman jor...@jordanzimmerman.com wrote: My focus is langs that have mediocre or no ZK support (i.e. Ruby). I’m thinking of adding a

Re: Ensure there is one master

2013-11-27 Thread Ted Dunning
@gmail.com wrote: Excuse my ignorance (I'm relatively new to ZK), but how does the accuracy of the clock affect this situation? On Wed, Nov 27, 2013 at 11:53 AM, Ted Dunning ted.dunn...@gmail.com wrote: This is not necessarily true. The old master may not have an accurate clock

Re: Ensure there is one master

2013-11-26 Thread Ted Dunning
This is not necessarily true. The old master may not have an accurate clock. The ascending id idea that Alex mentions is a very nice way to put more guarantees on the process. On Tue, Nov 26, 2013 at 2:58 PM, Alexander Shraer shra...@gmail.com wrote: Cameron's solution basically relies on

Re: ZK ensemble resused or dedicated?

2013-10-28 Thread Ted Dunning
For production settings with well-behaved applications, sharing is not a bad idea. I would definitely isolate development efforts onto a dev instance of ZK. And if you have trigger happy admins who think rebooting fixes all ills, I would consider separating apps. How is it that you wound up

Re: adding a separate thread to detect network timeouts faster

2013-09-10 Thread Ted Dunning
those types of failures is considered significant for some people. Are there technical reasons that would prevent this idea from working? On 09/10/2013 01:31 PM, Ted Dunning wrote: I don't see the strong value here. A few failures would be detected more quickly, but I am not convinced

Re: Zookeeper performance

2013-07-31 Thread Ted Dunning
Generally, ZK is much better as a coordination layer. Starting with an expected transaction load well above the normal limits of operation is not a grand idea. Much better to do something simpler like have ZK coordinate shard masters that each use conventional methods for handling transactions

Re: [Release 3.5.0] Any news yet?

2013-07-09 Thread Ted Dunning
+1 on moving to JDK 7 I have been through this decision a few times now. The first time I was -1 (eventually -0), the second time +0 and lately I am +1 on this change. ZK probably can't move to require JDK7, but 7 really is better. On Tue, Jul 9, 2013 at 9:09 AM, Patrick Hunt

Re: Is it a good idea to embed ZK in a product application ?

2013-07-02 Thread Ted Dunning
, a lot of big application have that). Within one cluster, it can done by enable/disable quorum. On Mon, Jul 1, 2013 at 8:37 PM, Ted Dunning ted.dunn...@gmail.com wrote: Embedding zk in a product is a fine thing. Embedding zk in an application is (in my estimation) a bad idea. The problem

Re: Is it a good idea to embed ZK in a product application ?

2013-07-01 Thread Ted Dunning
Embedding zk in a product is a fine thing. Embedding zk in an application is (in my estimation) a bad idea. The problem is that you generally want to coordinate all the way down to zero, especially in upgrade situations. Combining the uptime constraints of zk with those of your app is bad

Re: ZooKeeper survives 4 nodes, dies with 3

2013-06-11 Thread Ted Dunning
This sounds like you have a bad configuration for your cluster. On Tue, Jun 11, 2013 at 2:48 PM, NoamBC n...@cloudon.com wrote: Hi, We use ZK 3.3 in production, a quorum of 5 nodes, and found an odd behaviour where if we take one node offline, the quorum keeps serving requests, but if

Re: Consistency in zookeeper

2013-03-01 Thread Ted Dunning
That is always true due to pesky things like propagation delay. The ideas of before and after are very slippery unless you have and use primitives like sync to force boundaries. On Fri, Mar 1, 2013 at 1:38 PM, Yasin yasinceli...@gmail.com wrote: So, if the read request is made by some other

Re: Consistency in zookeeper

2013-03-01 Thread Ted Dunning
Yes. Sync doesn't guarantee up to date. It guarantees an ordering. It guarantees that if event A involves a ZK update and if you can guarantee that A occurs before sync, then any read on a client C that is done after a sync on C will see a successor state of A. On Fri, Mar 1, 2013 at 2:01 PM,

Re: High availability backend services via zookeeper or TCP load balancer

2013-02-26 Thread Ted Dunning
I think that Camille's points are well founded. However, I have just had two years experience with the opposite approach in which we use ZK as a way of doing application level load balancing and I really have come to like it. The situation is that the job tracker in MapR is a Hadoop Job Tracker

Re: Getting confused with the recipe for lock

2013-01-14 Thread Ted Dunning
Yes. And in general, you can't have precise distributed lock control. There will always be a bit of slop. So decide which penalty is easier to pay. Do you want at-most-one or at-least-one or something in between? You can't have exactly-one and still deal with expected problems like partition

Re: Znodes are really wait-free objects?

2012-12-10 Thread Ted Dunning
Wait-free means that from the user's point of view, all operations will proceed or fail immediately. Synchronization internal to the server to ensure coherent memory access doesn't really apply here. The point is that ZK is not a lock manager and there is no corollary to a wait operation. This

Re: determining zookeeper capacity requirements

2012-12-07 Thread Ted Dunning
On Fri, Dec 7, 2012 at 6:50 PM, Ian Kallen spidaman.l...@gmail.com wrote: No swap, we keep swap turned off following the better to OOM than slow down theory. Good theory except that it may compromise getting cores. I'm gonna try to get GC's plotted for all of these JVMs. I'm also finding

Re: very uneven distribution of clients to servers...

2012-12-06 Thread Ted Dunning
Next test would be to incorporate a small patch to seed the shuffling of server names by the client ID or something. On Thu, Dec 6, 2012 at 12:32 PM, Brian Tarbox briantar...@gmail.com wrote: I killed all the clients and then restarted them do its not a reconnection issue .

Re: determining zookeeper capacity requirements

2012-12-06 Thread Ted Dunning
Is there any swap activity on the client or server boxes? On Thu, Dec 6, 2012 at 8:25 PM, Ian Kallen spidaman.l...@gmail.com wrote: d) ZK swapping out due to inactivity during memory pressure? Can you cite an explanation or explain this here? I'm not sure what to look for. It wouldn't be

Re: very uneven distribution of clients to servers...

2012-12-05 Thread Ted Dunning
Kishore, That should be a good explanation, but it depends on where the returning node gets put into the replication chain. Most replicating systems get put at the end of a replication chain since that causes the least disruption. I don't know what ZK does, but this can be tested by determining

Re: very uneven distribution of clients to servers...

2012-12-05 Thread Ted Dunning
Shuffle depends on Math.random which is seeded by time of start. That should be just fine. On Wed, Dec 5, 2012 at 11:35 PM, Camille Fournier cami...@apache.org wrote: If you're using the Java ZooKeeper client, you can see in the code that the way connections are established is that we parse

Re: determining zookeeper capacity requirements

2012-12-05 Thread Ted Dunning
This looks like very low load. What is the rate of change on znodes (i.e. what is the desired watch signal rate)? On Wed, Dec 5, 2012 at 10:10 PM, Ian Kallen spidaman.l...@gmail.com wrote: We have an ensemble of three servers and have observed varying latencies, watches that seemingly don't

Re: Network partition and ephemeral nodes

2012-11-25 Thread Ted Dunning
the lock any more. On Nov 25, 2012, at 11:40 AM, Ted Dunning ted.dunn...@gmail.com wrote: To expand on this, when a client is disconnected, it receives a disconnect event. From that time until the client is reconnected, the client doesn't know what has happened. The session may have

Re: Zookeeper connect to a wrong leader

2012-11-21 Thread Ted Dunning
Why do you say leader and follower? Normally all members of a quorum can take on any role. On Tue, Nov 20, 2012 at 11:38 PM, Maru Panda song...@gmail.com wrote: Hi guys we have 1 zookeeper cluster On server A, there're 2 zookeepers, port: 2180 ( follower ) and 2181 ( leader ) another one

Re: 'Move' operation for Zookeeper

2012-11-15 Thread Ted Dunning
In particular, you can do something like this:
    contents, version = get(x/y/z)
    multi([delete(x/y/z, version), create(a/b/c, contents)])
The multi will fail if x/y/z has been updated between the get and the multi. The multi will also fail if a/b/c cannot be created. If the multi succeeds,
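
The versioned delete-plus-create described above can be illustrated with a toy in-memory model. This is a sketch under stated assumptions, not the ZooKeeper client API: `ToyStore`, its tuple-based operations, and `move` are hypothetical names invented here, and the model ignores details such as parent znodes having to exist.

```python
class ToyStore:
    """In-memory stand-in for a versioned znode tree (hypothetical, not ZooKeeper)."""

    def __init__(self):
        self.nodes = {}  # path -> (data, version)

    def create(self, path, data):
        if path in self.nodes:
            raise KeyError(f"{path} already exists")
        self.nodes[path] = (data, 0)

    def get(self, path):
        return self.nodes[path]

    def set(self, path, data):
        _, version = self.nodes[path]
        self.nodes[path] = (data, version + 1)

    def multi(self, ops):
        """Apply every op or none: work on a scratch copy, commit only on success."""
        scratch = dict(self.nodes)
        for op in ops:
            kind = op[0]
            if kind == "delete":
                _, path, version = op
                # Fails if the node is gone or was updated since it was read.
                if path not in scratch or scratch[path][1] != version:
                    return False
                del scratch[path]
            elif kind == "create":
                _, path, data = op
                if path in scratch:
                    return False
                scratch[path] = (data, 0)
            else:
                raise ValueError(f"unknown op {kind!r}")
        self.nodes = scratch
        return True


def move(store, src, dst):
    """Read src, then delete it and create dst in one all-or-nothing step."""
    contents, version = store.get(src)
    return store.multi([("delete", src, version), ("create", dst, contents)])
```

If `move` returns False, the caller can simply re-read the source and retry, since a failed multi leaves the store untouched.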

Re: question about data replication

2012-11-06 Thread Ted Dunning
You specify the MINIMUM number of copies when you define the number of nodes in your ZK cluster. The idea is that ZK requires strong consistency and provides guarantees to that effect. The only way to provide those guarantees is if a majority of the ZK cluster agree to and persist all changes.

Re: long time full gc

2012-10-28 Thread Ted Dunning
form server. So I think that need to takes a few milliseconds to receive session expried events from full gc finish, that's right? 2012/10/29 Ted Dunning ted.dunn...@gmail.com Yes. When the GC finishes, it will get the disconnect and session expiration event. You can experiment

Re: Prune Txn Log different dataLog and dataDir

2012-10-26 Thread Ted Dunning
You don't have Cloudera Zookeeper. You may have Apache Zookeeper that is packaged with Cloudera's software distribution. On Fri, Oct 26, 2012 at 8:31 AM, Roshan Punnoose rpunno...@texeltek.com wrote: I have my cloud era zookeeper (3.3.5) running with the dataLogDir and the dataDir going to two

Re: Some thing is wrong

2012-10-17 Thread Ted Dunning
Using 6 nodes for ZK is a bit odd. Actually, it is a bit even. If all of the nodes are involved in the quorum, you will get lower write throughput than with 5 nodes and slightly higher chance of failure since it is more likely to get 3/6 node failures versus 3/5 failures. What motivated your
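
The comparison between 5 and 6 voting nodes can be checked with a short calculation. This is a sketch that assumes each node fails independently with the same probability `p`; both that assumption and the value 0.01 are illustrative and do not come from the post.

```python
from math import comb


def quorum_size(n: int) -> int:
    """Smallest strict majority of n voting members."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many voters can be down while a majority remains."""
    return n - quorum_size(n)


def p_quorum_lost(n: int, p: float) -> float:
    """Probability that more voters fail than can be tolerated (independent failures)."""
    first_bad = tolerated_failures(n) + 1
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(first_bad, n + 1))


for n in (5, 6):
    print(n, quorum_size(n), tolerated_failures(n), p_quorum_lost(n, 0.01))
```

Both ensembles tolerate two failures, but with six voters there are more ways for three of them to be down at once, so the chance of losing quorum is higher.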

Re: Some thing is wrong

2012-10-17 Thread Ted Dunning
Ahh. Better. On Wed, Oct 17, 2012 at 6:29 PM, yang.li yang...@baifendian.com wrote: Thank you for your advice, Ted. Actually, the sixth node is set to observer mode, we just put it there to stand by.

Re: Memory leaks in zoo_multi API

2012-10-12 Thread Ted Dunning
Can you provide sample code and more detailed replication instructions? On Fri, Oct 12, 2012 at 6:06 PM, Deepak Jagtap deepak.jag...@maxta.com wrote: Sure, just reported this on jira! Thanks Regards, Deepak On Fri, Oct 12, 2012 at 1:01 PM, Michi Mutsuzaki mi...@cs.stanford.edu wrote:

Re: EC2 disk configuration for zookeeper? One big disk or two smaller ones per node?

2012-10-09 Thread Ted Dunning
Two disks could be a significant advantage. You should also experiment with ways to avoid VM induced gaps in time. Finally, if you really are going to be write heavy, 3 nodes are likely to perform better than 5. On Tue, Oct 9, 2012 at 2:03 PM, Brian Tarbox briantar...@gmail.com wrote: In

Re: EC2 disk configuration for zookeeper? One big disk or two smaller ones per node?

2012-10-09 Thread Ted Dunning
I have some experience with these. On Tue, Oct 9, 2012 at 4:47 PM, Eric Fleischman e...@ericfleischman.com wrote: * Using high IOPS instances Haven't tried this. * Using multiple disks and doing OS level RAID across them to present a single logical disk to the app This typically is very

Re: zookeeper on SSD

2012-10-03 Thread Ted Dunning
Yes. And Patrick's experience is not unexpected. There is, however, a huge variation with different types of flash memory. The software driving the flash can also result in very different experience. The experiences that he alludes to are likely with a conventional SSD packaging of flash

Re: zookeeper on SSD

2012-10-03 Thread Ted Dunning
AFAIK) to engineer out latency spikes. I'd imagine they started with a strong vendor and not a low end device, but of course this is just speculation. On Thu, Oct 4, 2012 at 12:28 PM, Ted Dunning ted.dunn...@gmail.com wrote: Yes. And Patrick's experience is not unexpected

Re: Millions of opened connections to Zookeeper cluster possible?

2012-09-17 Thread Ted Dunning
Another option is to use a proxy. On Mon, Sep 17, 2012 at 6:44 PM, Morris Bernstein mor...@systems-deployment.com wrote: The obvious solution would be running multiple Zookeepers hierarchically. Each server would take, say, 1000 clients. Each server would be a client of an uber Zookeeper.

Re: 2 server cluster?

2012-09-11 Thread Ted Dunning
Well... you can set one of the 2 as an observer. That way you can have a quorum of one of 2, but you can't have a quorum of the other of two. Probably not what you want. You can also set up a third member who refuses to become leader. They basically serve as a tiebreaker. You can lighten the

Re: Session refused, zxid too high

2012-09-03 Thread Ted Dunning
client or something else? (memory corruption on client possible?) Patrick On Fri, Aug 31, 2012 at 12:56 PM, Ted Dunning ted.dunn...@gmail.com wrote: But isn't the larger ZXID pretty stunningly large? The epoch number is nearly 2 billion and the transaction id is 845 million

Re: Session refused, zxid too high

2012-08-31 Thread Ted Dunning
But isn't the larger ZXID pretty stunningly large? The epoch number is nearly 2 billion and the transaction id is 845 million. These seem implausible from a starting point of 0. On Fri, Aug 31, 2012 at 12:54 PM, Patrick Hunt ph...@apache.org wrote: seen zxid 0x636c65616e2d7374 our last zxid

Re: ANN: Exhibitor 1.1.0

2012-06-27 Thread Ted Dunning
That link seems to point to the what is exhibitor page. Can you explain how this feature is not fantastically dangerous? Do you have long time constants? How do you avoid obvious crazy cases like 5 new hosts jumping onto a cluster where all of the original hosts in the quorum are down for

Re: Adding to a quorum

2012-06-18 Thread Ted Dunning
Yes. Updating and restarting works fine. One does wonder why you have *four* servers now. That generally doesn't provide any advantage over having three servers. On Mon, Jun 18, 2012 at 10:28 AM, David Nickerson davidnickerson4mailingli...@gmail.com wrote: Say I have a quorum of four

Re: Adding to a quorum

2012-06-18 Thread Ted Dunning
, 2012 at 2:35 PM, Jordan Zimmerman jzimmer...@netflix.com wrote: FYI - have a look at our Exhibitor system which makes upgrades/additions easier to manage. https://github.com/Netflix/exhibitor On 6/18/12 11:26 AM, Ted Dunning ted.dunn...@gmail.com wrote: Yes. Updating and restarting

Re: Multi question on using previous ops results

2012-05-18 Thread Ted Dunning
?] Another way to do what I would like would be to create a znode structure that is not attached to the existing one at all, then once I have it the way it should be, attach it to an existing node. However, I am quite certain this is not possible On 5/16/2012 7:21 PM, Ted Dunning wrote

Re: Multi question on using previous ops results

2012-05-16 Thread Ted Dunning
It looks like you are assuming that the versions of two znodes are synchronized. That seems pretty dangerous. What is the higher level intent here? Would it better to simply build a multi to update both znodes? Sent from my iPhone On May 16, 2012, at 11:59 AM, Joe Gamache

Re: zooKeeper connectionloss on a folder

2012-05-01 Thread Ted Dunning
. (com.netflix.curator.framework.recipes.queue.DistributedQueue) But feeling it is a problem of ZooKeeper. Thank you. Yuhan On Mon, Apr 30, 2012 at 10:50 PM, Ted Dunning ted.dunn...@gmail.com wrote: How many objects are in that directory? Sent from my iPhone On Apr 30, 2012, at 8:24 PM, Yuhan Zhang yzh

Re: Zookeeper server recovery behaviors

2012-04-19 Thread Ted Dunning
The client can't think it has succeeded with a deletion if it is connected to the minority side of a partitioned cluster. To think that, the commit would have to be ack'ed by a majority which by definition can't happen either because the master is in the minority and can't get a majority or

Re: How to replace a zookeeper server ?

2012-04-18 Thread Ted Dunning
As long as the old quorum constitutes a quorum in the new cluster you should be fine. In the replace a dead node scenario you can reconfigure the survivors before bringing up the new node. That only works if you start with 5 nodes, but being down a node in a three node cluster is a problem

Re: Zookeeper on short lived VMs and ZOOKEEPER-107

2012-03-15 Thread Ted Dunning
Alexander's comment still applies. VM's can function or go away completely, but they can also malfunction in more subtle ways such that they just go very slowly. You have to account for that failure mode. These failures can even be transient. This would probably break your approach. On

Re: Zookeeper on short lived VMs and ZOOKEEPER-107

2012-03-15 Thread Ted Dunning
point to a new IP (and hence cause the old server to be replaced). Did I understand your objection correctly? From: ext Ted Dunning [ted.dunn...@gmail.com] Sent: Thursday, March 15, 2012 19:50 To: user@zookeeper.apache.org Cc: shra...@gmail.com

Re: Possibility / consequences of having multiple elected leaders

2012-03-08 Thread Ted Dunning
-inc.com wrote: I’ve been wondering about this for a while, and suspect that this check doesn’t exist in the code… but I may be wrong. From: Ted Dunning [mailto:ted.dunn...@gmail.com] Sent: Wednesday, March 07, 2012 4:55 PM To: Alexander Shraer Cc: user@zookeeper.apache.org Subject: Re

Re: Possibility / consequences of having multiple elected leaders

2012-03-08 Thread Ted Dunning
From: Ted Dunning [mailto:ted.dunn...@gmail.com] Sent: Thursday, March 08, 2012 12:32 AM To: Alexander Shraer Cc: user@zookeeper.apache.org Subject: Re: Possibility / consequences of having multiple elected leaders The whole point of the zab protocol is to ensure

Re: Unable to restart ZK

2012-03-08 Thread Ted Dunning
Can you resolve the pronouns here? It looks like ZK is running, but that Finagle will not. Is that what you meant to say? The log messages make it look like somebody assumes that a directory /twitter/servers does exist, but is finding that this directory doesn't exist. Isn't that an

Re: Unable to restart ZK

2012-03-08 Thread Ted Dunning
://screencast.com/t/TWG529FGV0R Finagle added the paths to ZK, now when I try to restart, it looks like ZK is trying to replay some transaction, failing, and quitting. If I knew where the data file was, I could just delete the data file and start fresh. -Ryan On 3/8/12 4:03 PM, Ted Dunning

Re: Rolling upgrades

2012-03-08 Thread Ted Dunning
It won't be any different than a temporary state when one of 3 or 5 nodes is down. On Thu, Mar 8, 2012 at 4:10 PM, Jordan Zimmerman jzimmer...@netflix.com wrote: Also... I thought that ZK ensembles need to be odd in number. How would ZK handle a temporary state where there is an even number?

Re: Rolling upgrades

2012-03-08 Thread Ted Dunning
Any even number that is greater than half of the configured number of nodes is fine. The only *really* bad even number of servers is 0. On Thu, Mar 8, 2012 at 4:24 PM, Ted Dunning ted.dunn...@gmail.com wrote: It won't be any different than a temporary state when one of 3 or 5 nodes is down

Re: Possibility / consequences of having multiple elected leaders

2012-03-07 Thread Ted Dunning
This can be emulated on Linux by simply pausing the process. The correct behavior is that the old leader will freeze and if it comes back relatively soon, it will still be recognized as leader. If the pause is long enough, then the other members of the quorum will decide that they have lost

Re: Possibility / consequences of having multiple elected leaders

2012-03-07 Thread Ted Dunning
Not off the cuff and I have to run away right now. On Wed, Mar 7, 2012 at 4:07 PM, Alexander Shraer shra...@yahoo-inc.com wrote: Such a commit will be rejected due to an old epoch. Ted, can you please point me to the place in the code where this check is performed ? Thanks a lot, Alex

Re: Thread leak problem

2012-03-06 Thread Ted Dunning
If you were able to connect to 1 out of 3 servers, then you have a very serious problem since that isn't enough machines to form a quorum. I suspect you have a configuration error. On Tue, Mar 6, 2012 at 9:08 AM, Patrick Hunt ph...@apache.org wrote: This is windows I take it? What version of

Re: Zookeeper cluster failing

2012-03-06 Thread Ted Dunning
OK. Trying again against the correct thread. I trimmed down your summary to something that jumped out at me. If you have three ZK servers in a cluster, then taking 2 down should not allow normal operation. I think you have a config error. On Tue, Mar 6, 2012 at 8:15 AM, Scott Lindner

Re: session watches

2012-03-06 Thread Ted Dunning
Look at the Curator library. On Tue, Mar 6, 2012 at 9:50 AM, Shelley, Ryan ryan.shel...@disney.com wrote: Ok, I'll go back and see if I can find the Leader Election examples and rethink the problem a bit more. Thanks for the valuable insights. I truly appreciate it. -Ryan There are

Re: session watches

2012-03-05 Thread Ted Dunning
Yes ^ 4 On Mon, Mar 5, 2012 at 3:59 PM, Shelley, Ryan ryan.shel...@disney.com wrote: If I'm correct after reading the docs, if my client sets a watch on a znode, disconnects, then automatically reconnects, any appropriate pending watches will be fired. I'm assuming this is only as long as the

Re: session watches

2012-03-05 Thread Ted Dunning
On Mon, Mar 5, 2012 at 3:59 PM, Shelley, Ryan ryan.shel...@disney.com wrote: Additionally, I'm curious if there is a way to know if there are any watches currently on a znode. Since we can't do ephemeral parent nodes, yet, and if I have a client that fails (which results in losing their

Re: session watches

2012-03-05 Thread Ted Dunning
Ryan, Lots of assumptions here. On Mon, Mar 5, 2012 at 5:02 PM, Shelley, Ryan ryan.shel...@disney.com wrote: The reason I was thinking that it might be useful to know if there are any watches on a node is for the lack of ephemeral parent nodes. If a node doesn't have a watch on it, I can

Re: Getting data after the watch

2012-03-02 Thread Ted Dunning
I don't think that your test tests what you think it tests. You won't necessarily get a notification every time a value changes. If a value changes and a watch is queued, you won't get another notification until you get the data and set the new watch. If another change happens before you reset
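
The point about missed notifications can be illustrated with a toy model of a one-shot watch. This is a sketch, not the ZooKeeper client API: `ToyNode` and its methods are hypothetical names invented here, and notification delivery is reduced to a simple counter.

```python
class ToyNode:
    """Single value with a one-shot watch (hypothetical model, not ZooKeeper)."""

    def __init__(self, value):
        self.value = value
        self.watch_armed = False
        self.notifications = 0

    def get(self, watch=False):
        # Reading with watch=True arms the watch for the next change only.
        if watch:
            self.watch_armed = True
        return self.value

    def set(self, value):
        self.value = value
        if self.watch_armed:
            # The watch fires once and is then cleared until it is re-armed.
            self.watch_armed = False
            self.notifications += 1


node = ToyNode(0)
seen = [node.get(watch=True)]    # client reads 0 and arms the watch
node.set(1)                      # fires the single pending notification
node.set(2)                      # no watch armed, so no notification
node.set(3)                      # likewise unnoticed
seen.append(node.get(watch=True))  # client finally re-reads and re-arms
print(seen, node.notifications)
```

The client observes the first and last values but never sees 1 or 2, which is why logic that depends on seeing every intermediate value is unsafe.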

Re: Getting data after the watch

2012-03-02 Thread Ted Dunning
On Fri, Mar 2, 2012 at 3:33 PM, Amirhossein Kiani amirhki...@gmail.com wrote: Oh.. you are right. The value was skipped because I was getting two of some values (newer but not older) So I should restructure my code to not make decisions that rely on locks created based on node data (unless I

Re: Create nested paths

2012-03-01 Thread Ted Dunning
. On 2/29/12 7:16 PM, Ted Dunning ted.dunn...@gmail.com wrote: On Wed, Feb 29, 2012 at 7:04 PM, Marshall McMullen marshall.mcmul...@gmail.com wrote: Yes, Ted's right. The multi has to fail as that's part of the contract it guarantees. The only thing you could do, which will significantly

Re: Create nested paths

2012-03-01 Thread Ted Dunning
(which, even if it did work, the CheckResult object doesn't include the path, so I'm just hoping the results are in the same order as the operations were listed in, as I have to infer the path I'll need to build from the position of the OpResult in the list). On 3/1/12 3:05 PM, Ted Dunning

Re: Create nested paths

2012-02-29 Thread Ted Dunning
Well, in 3.4, you can use multi to do this. On Wed, Feb 29, 2012 at 4:08 PM, Shelley, Ryan ryan.shel...@disney.com wrote: Is it possible to create all nodes in a path in one step? For example, my ZK is empty of any nodes. I want to create: /lorem/ipsum/foo/bar Do I have to create each

Re: Create nested paths

2012-02-29 Thread Ted Dunning
at 4:26 PM, Shelley, Ryan ryan.shel...@disney.com wrote: Can you reference some docs? I looked but couldn't find anything regarding multi. On 2/29/12 4:23 PM, Ted Dunning ted.dunn...@gmail.com wrote: Well, in 3.4, you can use multi to do this. On Wed, Feb 29, 2012 at 4:08 PM, Shelley, Ryan

Re: Create nested paths

2012-02-29 Thread Ted Dunning
On Wed, Feb 29, 2012 at 7:04 PM, Marshall McMullen marshall.mcmul...@gmail.com wrote: Yes, Ted's right. The multi has to fail as that's part of the contract it guarantees. The only thing you could do, which will significantly narrow the race condition, is as you're *building *the multi,

Re: znode defaults

2012-02-21 Thread Ted Dunning
ZK doesn't do this out of the box, but it is very easy to wrap up ZK primitives to get this behavior. On Wed, Feb 22, 2012 at 12:45 AM, Shelley, Ryan ryan.shel...@disney.com wrote: Hi folks. Very new to ZooKeeper and evaluating it for a project. The ZK directory-like layout is perfect for what

Re: sync vs. async vs. multi performances

2012-02-14 Thread Ted Dunning
These results are about what is expected although they might be a little more extreme. I doubt very much that hbase is mutating zk nodes fast enough for this to matter much. Sent from my iPhone On Feb 14, 2012, at 8:00, N Keywal nkey...@gmail.com wrote: Hi, I've done a test with

Re: sync vs. async vs. multi performances

2012-02-14 Thread Ted Dunning
status written in . On paper, if we need 0,02s per node, that makes it to the minute to recover, just for zookeeper. That's theory. I haven't done a precise measurement yet. Anyway, if ZooKeeper can be faster, it's always very interesting :-) Cheers, N. On Tue, Feb 14, 2012 at 8:00 PM, Ted

Re: sync vs. async vs. multi performances

2012-02-14 Thread Ted Dunning
Yes it is possible. With a loaded server, each group of transactions will take about one rotation. But the time from when they arrived to the time that they are committed will be roughly 0 ... 8 ms for a 7200 RPM drive because the transactions will be arriving at different times. There will be
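
The "0 ... 8 ms" range follows from the rotation period of the disk. A minimal sketch of the arithmetic is below; the mean-wait figure assumes transactions arrive uniformly at random within a rotation, which is an illustrative assumption and not a claim from the post.

```python
def rotation_ms(rpm: float) -> float:
    """Time for one full platter rotation, in milliseconds."""
    return 60_000.0 / rpm


def mean_wait_ms(rpm: float) -> float:
    """Average wait for the next commit, assuming uniformly random arrival times."""
    return rotation_ms(rpm) / 2


print(f"{rotation_ms(7200):.2f} ms per rotation, {mean_wait_ms(7200):.2f} ms mean wait")
```

At 7200 RPM one rotation takes about 8.3 ms, which matches the upper end of the range quoted in the post.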

Re: Deployment planning question

2012-02-03 Thread Ted Dunning
On Fri, Feb 3, 2012 at 4:01 PM, Jason Harmon jh0...@att.com wrote: My preference, of course, would be to have three datacenters...if one is partitioned off, zookeeper would not respond, which would be perfect. In that scenario, our other two would still be up and running in the other two

Re: QuorumPeer requires at least 2 peers?

2012-02-02 Thread Ted Dunning
One of the required functions of Zookeeper is that network partition should not result in inconsistent results. If you have 2 servers, then partition leaves a symmetrical situation that you have to break in order to have a reasonable way to continue. Since you can't, in general, distinguish

Re: Configuration management based on load

2012-01-30 Thread Ted Dunning
I am completely clueless about what you are asking here. Can you be a bit more explicit? - do you mean that you want to increase the number of threads in the listeners? - do you mean that you want to increase the number of listeners? - what do you mean by load? Load on Zookeeper? Load on

Re: Backups

2012-01-19 Thread Ted Dunning
A backup can still be useful. It is a common property that a database backup is known to be slightly out of date. Such a backup can still be very useful. In many systems, the most common cause of error is simple human intervention. This especially applies to file systems and databases, but can
