Re: the error

2010-03-31 Thread Ted Dunning
As I pointed out in my response, you should distinguish hard and soft failures. If one machine fails even catastrophically, you can provide a new machine to replace it, thus converting a hard failure into a soft one. The conclusion is the same. Three machines is vastly better than one or two. O

Re: the error

2010-03-31 Thread Ted Dunning
Suppose a machine has probability of soft failure p_1 and of catastrophic failure p_2 << p_1. Assume that two machines have independent failure modes. Probability of soft failure of a one-machine cluster = p_1, two-machine cluster = probability of soft failure of 1 or 2 machines + probability of one machine ha
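To make the arithmetic concrete, here is a rough sketch in Python. It is a simplification of the argument above, not the full model: it assumes a single per-machine failure probability p (it does not separate soft from catastrophic failures), independent failures, and a strict-majority quorum. The function names are made up for illustration.

```python
from math import comb


def quorum_size(n: int) -> int:
    """Smallest strict majority of n servers (assumed quorum rule)."""
    return n // 2 + 1


def p_quorum_lost(n: int, p: float) -> float:
    """Probability that too few of n servers are up to form a majority.

    Assumes each server fails independently with the same probability p.
    """
    tolerated = n - quorum_size(n)  # failures the ensemble can survive
    # Sum the binomial probabilities of every failure count above that.
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(tolerated + 1, n + 1)
    )


if __name__ == "__main__":
    p = 0.01  # illustrative value only, not a measured failure rate
    for n in (1, 2, 3):
        print(n, quorum_size(n), p_quorum_lost(n, p))
```

Under these assumptions the closed forms are p for one machine, 2p - p^2 for two (either failure loses the quorum), and 3p^2 - 2p^3 for three, so for small p two machines come out worse than one and three come out much better.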

Re: the error

2010-03-31 Thread Henry Robinson
>> Using two machines running ZK will actually decrease your reliability
>> compared to using a single machine. Consider using one machine or three.
>
> ?
>
> Not meaning to pull the thread off-topic, but I don't understand why this
> should be the case. Can you elaborate?
>
> With majority-based

Re: the error

2010-03-31 Thread David Rosenstrauch
On 03/31/2010 02:10 PM, Ted Dunning wrote: To add to Patrick's comments, I hope you mean that you are connecting to ZK from a cluster of two machines rather than having only two machines that form a ZK cluster. Using two machines running ZK will actually decrease your reliability compared to usi

Re: the error

2010-03-31 Thread Ted Dunning
To add to Patrick's comments, I hope you mean that you are connecting to ZK from a cluster of two machines rather than having only two machines that form a ZK cluster. Using two machines running ZK will actually decrease your reliability compared to using a single machine. Consider using one mach

Re: Re: Re: Re: How to ensure trasaction create-and-update

2010-03-31 Thread Ted Dunning
That is one of the great virtues in working with ZK... in the event of a server failure, you get behavior as good as can be expected. There are several failure scenarios:
a) a (small) fraction of the ZK servers fail or are cut off, but a quorum persists
b) a (large) fraction of the ZK servers fa

Re: the error

2010-03-31 Thread Patrick Hunt
Hi Li, when you say 17 threads reading a znode, do you mean that you have 17 threads each creating a session and using that session to read a znode? If so it's probably due to this: http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html#sc_advancedConfiguration see the parameter "

the error

2010-03-31 Thread li li
Dear developers, when I use ZooKeeper in a cluster of two machines, I run into a problem when more than 17 threads read data from one znode. I get the error info below. I would greatly appreciate it if you could tell me why this happens. 2010-03-31 17:46:59,718 - WARN [main-SendThrea

Re: Re: Re: Re: How to ensure trasaction create-and-update

2010-03-31 Thread zd.wbh
Ted, your suggested flow guarantees that the update sequence succeeds or fails completely. That holds under the assumption that the ZooKeeper requester is stable enough. What if a server restart occurs during the update sequence, so that no abort or proceed action can be done? I'm just curious how to handle this kind of

[ANN] Eclipse GIT plugin beta version released

2010-03-31 Thread Thomas Koch
GIT is one of the most popular distributed version control systems. In the hope that more Java developers may want to explore the world of easy branching, merging, and patch management, I'd like to inform you that a beta version of the upcoming Eclipse GIT plugin is available: http://www.infoq.