is_unrecoverable() means exactly that: the session is toast. nothing you
do will get it back.
zookeeper_init is almost never used with a non-null client_id. the main
use case for it is crash recovery. i've rarely seen it used, but you can
start a session, save off the client_id to disk, create
oops, sorry camille, i didn't mean to replicate your answer. you
explained it better than me :)
ben
On 11/18/2010 10:06 AM, Fournier, Camille F. [Tech] wrote:
This is exactly the scenario that you use to test session expiration, make one
connection to a ZK and then another with the same sessi
ah i see. you are manually reestablishing the connection to B using the
session identifier for the session with A.
the problem is that when you call "close" on a session, it kills the
session. we don't really have a way to close a handle without doing that.
(actually there is a test class that do
that quote is a bit out of context. it was with respect to a proposed
change.
in your scenario can you explain step 4)? what are you closing?
ben
On 11/18/2010 07:16 AM, Gustavo Niemeyer wrote:
Greetings,
As some of you already know, we've been using ZooKeeper at Canonical
for a project we'v
at 3:45 PM, Benjamin Reed wrote:
it would have to be a TCP based load balancer to work with ZooKeeper
clients, but other than that it should work really well. The clients
will be doing heart beats so the TCP connections will be long lived. The
client library does random connection load balancing anyway.
ben
On 11/03/2010 12:
it, and
then trying to create more sequential znodes. I'm guessing this is
pretty well-tested behavior, so there must be something weird or wrong
about the way I have stuff setup.
I'm happy to provide whatever logs or snapshots might help someone
track this down. Thanks,
Jeremy
On 1
how were you able to reproduce it?
all the znodes in /zkrsm were created with the sequence flag. right?
ben
On 11/01/2010 02:28 PM, Jeremy Stribling wrote:
We were able to reproduce it. A "stat" on all three servers looks
identical:
[zk:(CONNECTED) 0] stat /zkrsm
cZxid = 9
ctime = Mon Nov 01
class in Hedwig? Is it used
somewhere for this concurrent read/write problem?
-regards
Amit
- Original Message
From: Benjamin Reed
To: zookeeper-user@hadoop.apache.org
Sent: Fri, 22 October, 2010 11:09:07 AM
Subject: Re: Is it possible to read/write a ledger concurrently
currently program1 can read and write to an open ledger, but program2
must wait for the ledger to be closed before doing the read. the problem
is that program2 needs to know the last valid entry in the ledger.
(there may be entries that may not yet be valid.) for performance
reasons, only prog
we should put in a test for that. it is certainly a plausible
scenario. in theory it will just flow into the next epoch and everything
will be fine, but we should try it and see.
ben
On 10/19/2010 11:33 AM, Sandy Pratt wrote:
Just as a thought experiment, I was pondering the following:
ZK s
which scheme are you using?
ben
On 10/18/2010 11:57 PM, FANG Yang wrote:
2010/10/19 FANG Yang
hi, all
I have a simple zk client written by c ,which is attachment #1. When i
use ZOO_CREATOR_ALL_ACL, the ret code of zoo_create is -114 (Invalid ACL
specified, defined in zookeeper.h), but
we should be exposing those classes and releasing them as a testing
jar. do you want to open up a jira to track this issue?
ben
On 10/18/2010 05:17 AM, Anthony Urso wrote:
Anyone have any pointers on how to test against ZK outside of the
source distribution? All the fun classes (e.g. ClientBa
state. Is my
understanding correct? Please advice.
Thanks
Avinash
On Tue, Oct 12, 2010 at 10:45 AM, Benjamin Reed wrote:
ZooKeeper considers a client dead when it hasn't heard from that
client during the timeout period. clients make sure to communicate with
ZooKeeper at least once in 1/3 the timeout period. if the client doesn't
hear from ZooKeeper in 2/3 the timeout period, the client will issue a
ConnectionLos
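The 1/3 and 2/3 fractions above boil down to simple arithmetic. Here is a minimal sketch of that timing rule (illustrative only, not ZooKeeper's actual client code; the class, method names, and 30-second timeout are assumptions):

```java
// Sketch of the client-side timing rules described above.
public class SessionTiming {
    // the client makes sure to ping at least once per third of the timeout
    static long pingIntervalMs(long sessionTimeoutMs) {
        return sessionTimeoutMs / 3;
    }

    // if the client hears nothing for two thirds of the timeout,
    // it declares connection loss and tries another server
    static long connectionLossAfterMs(long sessionTimeoutMs) {
        return 2 * sessionTimeoutMs / 3;
    }

    public static void main(String[] args) {
        long timeout = 30_000; // assumed 30s session timeout
        System.out.println(pingIntervalMs(timeout));        // 10000
        System.out.println(connectionLossAfterMs(timeout)); // 20000
    }
}
```

Note that the server only declares the client dead after the full timeout, so the client always notices the problem (at 2/3) before the server gives up on it.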
d who
the follower to get more insight?
Thanks
A
On Sun, Oct 10, 2010 at 8:33 AM, Benjamin Reed wrote:
this usually happens when a follower closes its connection to the leader. it is
usually caused by the follower shutting down or failing. you may get further
insight by looking at the follower logs. you should really run with timestamps
on so that you can correlate the logs of the leader and foll
eeper used in production as a WAL
(or for any other use) anywhere? If so, for what uses?
Any info (even anecdotal) would be great!
-jake
On Thu, Oct 7, 2010 at 9:15 AM, Benjamin Reed wrote:
hi amit,
sorry for the late response. this week has been crunch time for a lot of
different things.
here are your answers:
production
1. it is still in prototype phase. we are evaluating different aspects,
but there is still some work to do to make it production ready. we also
need to get
you will need to time how long it takes to read all that state back in
and adjust initLimit accordingly. it will probably take a while to
pull all that data into memory.
ben
On 10/05/2010 11:36 AM, Avinash Lakshman wrote:
I have run it over 5 GB of heap with over 10M znodes. We will defin
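A back-of-the-envelope way to turn a measured snapshot-load time into a tick-based limit, per the advice above (a sketch; the class name, headroom factor, and example numbers are assumptions, not ZooKeeper code):

```java
public class InitLimitSizing {
    // initLimit is expressed in ticks; round up and leave headroom
    static int ticksNeeded(long measuredLoadMs, long tickTimeMs, double headroom) {
        return (int) Math.ceil(measuredLoadMs * headroom / tickTimeMs);
    }

    public static void main(String[] args) {
        long tickTime = 2000;       // assumed tickTime of 2s
        long measuredLoad = 90_000; // assumed: 90s to read a big snapshot back in
        System.out.println(ticksNeeded(measuredLoad, tickTime, 1.5)); // 68
    }
}
```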
we should also point out that our ops guys here at yahoo! don't like
the break at major clause. i imagine when we do the next major release
we will try to be one release backwards compatible. (although we
shouldn't promise it until we successfully do it once :)
ben
On 09/30/2010 10:29 AM, Pa
ah dang, i should have said "generate a close request for the session
and push that through the system."
ben
On 09/10/2010 01:01 PM, Benjamin Reed wrote:
the problem is that followers don't track session timeouts. they track
when they last heard from the sessions that a
't really want to waste my time if there's a fundamental reason it's a bad
idea.
Thanks,
Camille
-----Original Message-
From: Benjamin Reed [mailto:br...@yahoo-inc.com]
Sent: Wednesday, September 08, 2010 4:03 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: closing session
@hadoop.apache.org
Cc: Benjamin Reed
Subject: Re: closing session on socket close vs waiting for timeout
This really is, just as Ben says a problem of false positives and false
negatives in detecting session
expiration.
On the other hand, the current algorithm isn't really using all the
information avai
this session type (so
4 would fail). Would that address your concern, others?
Patrick
On 09/01/2010 10:03 AM, Benjamin Reed wrote:
i'm a bit skeptical that this is going to work out properly. a server
may receive a socket reset even though the client is still alive:
1) client sends a request to a server
2) client is partitioned from the server
3) server starts trying to send response
4) client reconnects to a different serv
, Ted Dunning wrote:
Put in a four letter command that will put the server to sleep for 15
seconds!
:-)
On Thu, Aug 19, 2010 at 3:51 PM, Benjamin Reed wrote:
i'm updating ZOOKEEPER-366 with this discussion and try to get a patch out.
Qing (or anyone else, can you reproduce it pretty easily?)
me.
On Thu, Aug 19, 2010 at 9:19 AM, Benjamin Reed wrote:
yes, you are right. we could do this. it turns out that the expiration code
is very simple:
while (running) {
currentTime = System.currentTimeMillis();
if (nextExpirationTime>
if we can't rely on the clock, we cannot say things like "if ... for 5
seconds".
also, clients connect to servers, not vice versa, so we cannot say
things like "server can attempt to reconnect".
ben
On 08/19/2010 10:17 AM, Vishal K wrote:
Hi Ted,
I haven't given it serious thought yet, bu
could be given a bit of a second lease
on life, delaying
all of their expiration. Since time-outs are relatively short, the server
would be able to forget
about the bump very shortly.
On Thu, Aug 19, 2010 at 8:22 AM, Benjamin Reed wrote:
if we try to use network messages to detect and corre
i'm afraid it isn't that simple. we figure out who is expired by
bucketizing sessions to be expired in an interval. if we hear from a
session, we move it to a different bucket; otherwise, when the bucket expires,
everything in that bucket goes away.
when time jumps, it looks to the server like ther
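The bucketing described above can be sketched as follows (a deliberately simplified model for illustration, not ZooKeeper's actual session tracker):

```java
import java.util.*;

public class BucketExpiry {
    final long intervalMs;  // expiration granularity
    final TreeMap<Long, Set<String>> buckets = new TreeMap<>();
    final Map<String, Long> bucketOf = new HashMap<>();

    BucketExpiry(long intervalMs) { this.intervalMs = intervalMs; }

    // round the session's deadline up to the next bucket boundary
    long bucketFor(long nowMs, long timeoutMs) {
        return ((nowMs + timeoutMs) / intervalMs + 1) * intervalMs;
    }

    // hearing from a session moves it into a later bucket
    void touch(String session, long nowMs, long timeoutMs) {
        long b = bucketFor(nowMs, timeoutMs);
        Long old = bucketOf.put(session, b);
        if (old != null && old != b) {
            Set<String> s = buckets.get(old);
            if (s != null) s.remove(session);
        }
        buckets.computeIfAbsent(b, k -> new HashSet<>()).add(session);
    }

    // everything still sitting in an elapsed bucket expires at once
    Set<String> expire(long nowMs) {
        Set<String> expired = new HashSet<>();
        Iterator<Map.Entry<Long, Set<String>>> it = buckets.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, Set<String>> e = it.next();
            if (e.getKey() > nowMs) break; // TreeMap iterates in order
            expired.addAll(e.getValue());
            it.remove();
        }
        for (String s : expired) bucketOf.remove(s);
        return expired;
    }
}
```

Sessions that keep pinging keep migrating to later buckets; a silent session stays put and is swept when its bucket elapses. This also shows why a forward clock jump is a problem: many buckets elapse at once and everything in them looks expired.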
do you have a pointer to those timers?
thanx
ben
On 08/18/2010 11:58 PM, Martin Waite wrote:
On Linux, I believe that there is a class of timers
provided that is immune to this, but I doubt that there is a platform
independent way of coping with this.
there are two things to keep in mind when thinking about this issue:
1) if a zk client is disconnected from the cluster, the client is
essentially in limbo. because the client cannot talk to a server it
cannot know if its session is still alive. it also cannot close its session.
2) the client
the client does keep track of the watches that it has outstanding. when
it reconnects to a new server it tells the server what it is watching
for and the last view of the system that it had.
ben
On 08/16/2010 09:28 AM, Qian Ye wrote:
thx for the explanation. Since the watcher can be preserved wh
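The client-side bookkeeping described above (remembering outstanding watches and the last view of the system, then replaying them on reconnect) can be sketched like this (an illustrative model; the names are made up, not the real client's internals):

```java
import java.util.*;

public class WatchTable {
    final Set<String> watchedPaths = new TreeSet<>(); // outstanding watches
    long lastSeenZxid = 0;                            // last view of the system

    void watch(String path)        { watchedPaths.add(path); }
    void sawTransaction(long zxid) { lastSeenZxid = Math.max(lastSeenZxid, zxid); }

    // on reconnect the client replays this to the new server, which can then
    // fire any watches for changes that happened after lastSeenZxid
    String setWatchesRequest() {
        return "setWatches(zxid=" + lastSeenZxid + ", paths=" + watchedPaths + ")";
    }
}
```

This is why application code does not need to re-register anything itself after a disconnect/reconnect cycle.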
good point ted! i should have waited a bit longer before responding :)
ben
On 08/16/2010 09:20 AM, Ted Dunning wrote:
There are two different concepts. One is connection loss. Watchers survive
this and the client automatically connects
to another member of the ZK cluster.
The other is sessio
zookeeper takes care of reregistering all watchers on reconnect. you
don't need to do anything.
ben
On 08/16/2010 09:04 AM, Qian Ye wrote:
Hi all:
Will the watchers of a client be lost when the client disconnects from a
Zookeeper server? It is said at
http://hadoop.apache.org/zookeeper/docs/
i thought there was a jira about supporting embedded zookeeper. (i
remember rejecting a patch to fix it. one of the problems is that we
have a couple of places that do System.exit().) i can't seem to find it
though.
one case that would be great for embedding is writing test cases, so i
think
as long as a watcher object is only used with a single ZooKeeper object
it will be called by the same thread.
ben
On 07/21/2010 11:12 AM, Joshua Ball wrote:
Hi,
Do implementations of Watcher need to be thread-safe, or can I assume
that process(...) will always be called by the same thread?
T
i did a benchmark a while back to see the effect of turning off the
disk. (it wasn't as big as you would think.) i had to modify the code.
there is an option to turn off the sync in the config that will get you
most of the performance you would get by turning off the disk entirely.
ben
On 07/
it is still guaranteed to see its own write. when a client reconnects to
a different server, we guarantee that the new server will be at least as
up-to-date as the last server. otherwise the client would go back in
time and a lot of things would go wrong.
ben
On 07/20/2010 08:28 AM, Jun Rao w
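The "at least as up-to-date" rule above amounts to a monotonicity check at reconnect time. A minimal sketch (illustrative only; the real client enforces this via the zxid exchanged in the connect handshake):

```java
public class ReconnectCheck {
    // a client must never attach to a server that is behind its own view,
    // or it would "go back in time" and could miss its own writes
    static boolean acceptable(long clientLastZxid, long serverLastZxid) {
        return serverLastZxid >= clientLastZxid;
    }

    public static void main(String[] args) {
        long clientZxid = 7; // assumed: last zxid the client has seen
        System.out.println(acceptable(clientZxid, 9)); // true: server is ahead
        System.out.println(acceptable(clientZxid, 3)); // false: stale server
    }
}
```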
) then you need to synchronize.
ben
On 07/19/2010 03:36 PM, Srikanth Bondalapati: wrote:
Thanks Dave& Ben.
So, ultimately I need to synchronize process() method, when the same Watcher
object is registered with different zookeeper handles (or Znodes). :)
On Mon, Jul 19, 2010 at 3:03 PM, Benj
you have concluded correctly.
1) bookkeeper was designed for a process to use as a write-ahead log, so
as a simplifying assumption we assume a single writer to a log. we
should be throwing an exception if you try to write to a handle that you
obtained using openLedger. can you open a jira for
yes, you (and dave) are correct. watches are invoked sequentially in
order. the only time you can run into trouble is if you register the
same watcher object with different zookeeper handles since there is a
dispatch thread per zookeeper handle.
ben
On 07/19/2010 02:50 PM, Srikanth Bondalapat
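The hazard described above (one watcher object shared by several handles, each handle having its own dispatch thread) can be sketched with plain threads (an illustrative model, not ZooKeeper classes):

```java
public class SharedWatcher {
    private int events = 0;

    // synchronized because two dispatch threads (one per handle) may call it;
    // with a single handle there is only one dispatch thread and no race
    synchronized void process(String event) { events++; }
    synchronized int count() { return events; }

    public static void main(String[] args) throws InterruptedException {
        SharedWatcher w = new SharedWatcher();
        // simulate the per-handle dispatch threads of two ZooKeeper handles
        Runnable dispatch = () -> { for (int i = 0; i < 10_000; i++) w.process("e"); };
        Thread h1 = new Thread(dispatch), h2 = new Thread(dispatch);
        h1.start(); h2.start();
        h1.join(); h2.join();
        System.out.println(w.count()); // without synchronized, updates could be lost
    }
}
```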
how big is your database? it would be good to know the timing of the two calls.
shutdown should take very little time.
sent from my droid
-Original Message-
From: Vishal K [vishalm...@gmail.com]
Received: 7/16/10 6:31 PM
To: zookeeper-user@hadoop.apache.org [zookeeper-u...@hadoop.apache.
i think there is a wiki page on this, but for the short answer:
the number of znodes impact two things: memory footprint and recovery
time. there is a base overhead to znodes to store its path, pointers to
the data, pointers to the acl, etc. i believe that is around 100 bytes.
you can't just di
by custom QuorumVerifier are you referring to
http://hadoop.apache.org/zookeeper/docs/r3.3.1/zookeeperHierarchicalQuorums.html
?
ben
On 07/14/2010 12:43 PM, Sergei Babovich wrote:
Hi,
We are currently evaluating use of ZK in our infrastructure. In our
setup we have a set of servers running fr
that means that your connection to zookeeper has broken. usually because
the server you were connected to failed.
see http://wiki.apache.org/hadoop/ZooKeeper/ErrorHandling
ben
On 07/14/2010 11:41 AM, Avinash Lakshman wrote:
Hi All
I run into this periodically. I am curious to know what this
ted is correct, as usual. that warning is really to avoid unnecessary
load, and 16 clients really don't generate much of a load at all. even
with thousands of clients, if they really need the list of children it
will still be ok. the point of that note was that for leader election
only one proce
can you try the following:
Index: src/contrib/fatjar/build.xml
===
--- src/contrib/fatjar/build.xml(revision 962637)
+++ src/contrib/fatjar/build.xml(working copy)
@@ -46,6 +46,7 @@
+
thanx
ben
On 07/09/2010
the difference between close and disconnect is that close will actually
try to tell the server to kill the session before disconnecting.
a paranoid lock implementation doesn't need to test its session. it
should just monitor watch events to look for disconnect and expired
events. if a client
watchers are executed sequentially and in order. there is one dispatch
thread that invokes the watch callbacks.
ben
ps - in 2) you do not install a watch.
On 06/29/2010 06:13 AM, André Oriani wrote:
Hi,
Are Watchers executed sequentially or in parallel ? Suppose I want to
monitor the childr
we do this in our tests for ZooKeeper. bookkeeper uses the testing
classes as well, unfortunately, we haven't documented the interface.
ben
On 06/22/2010 08:42 PM, Ishaaq Chandy wrote:
Hi all,
First some background:
1. We use maven as our build tool.
2. We use Hudson as our CI server, it is s
yes. (except for the single threaded C-client library :)
ben
On 06/17/2010 10:16 AM, Jun Rao wrote:
Hi,
Is ZK client thread safe? Is it ok for multiple threads sharing the same ZK
client? Thanks,
Jun
the call is executed at a later time on a different thread. the zoo_a*
calls are non-blocking, so (subject to the thread scheduling) usually
they will return before the request completes.
ben
On 06/03/2010 01:24 PM, Jack Orenstein wrote:
I'm trying to figure out how to use zookeeper's C API.
charity, do you mind going through your scenario again to give a
timeline for the failure? i'm a bit confused as to what happened.
ben
On 06/02/2010 01:32 PM, Charity Majors wrote:
Thanks. That worked for me. I'm a little confused about why it threw the
entire cluster into an unusable state
get SSL in.
On 05/26/2010 04:44 PM, Mahadev Konar wrote:
Hi Vishal,
Ben (Benjamin Reed) has been working on a netty based client server
protocol in ZooKeeper. I think there is an open jira for it. My network
connection is pretty slow so am finding it hard to search for it.
We have been
good catch lei! if this helps gregory, can you open a jira to throw an
exception in this situation. we should be throwing an invalid argument
exception or something in this case.
thanx
ben
On 05/20/2010 09:04 AM, Lei Zhang wrote:
Seems you are passing in wrong arguments:
Should have been:
is this a bug? shouldn't we be returning an error.
ben
On 05/12/2010 11:34 AM, Patrick Hunt wrote:
I think that explains it then - the server is probably dropping the new
(3.3.0) "getChildren" message (xid 7) as it (3.2.2 server) doesn't know
about that message type. Then the server responds to
i agree with ted. i think he points out some disadvantages with trying
to do more. there is a slippery slope with these kinds of things. the
implementation is complicated enough even with the simple model that we use.
ben
On 03/29/2010 08:34 PM, Ted Dunning wrote:
I perhaps should not have sa
awesome! that would be great ivan. i'm sure pat has some more concrete
suggestions, but one simple thing to do is to run the unit tests and
look at the log messages that get output. there are a couple of
categories of things that need to be fixed (this is in no way exhaustive):
1) messages tha
yes it means in sync with the leader. syncLimit governs the timeout when
a follower is actively following a leader. initLimit is the initial
connection timeout. because there is the potential for more data that
needs to be transmitted during the initial connection, we want to be
able to manage
we have updated ZOOKEEPER-713 with much more detail, but the bottom line
is that the Invalid snapshot was caused by an OutOfMemoryError. this
turns out not to be a problem since we recover using an older snapshot.
there are other things that are happening that are the real causes of
the problem. s
weird, this does sound like a bug. do you have a reliable way of
reproducing the problem?
thanx
ben
On 03/16/2010 08:27 AM, Łukasz Osipiuk wrote:
nope.
I always pass 0 as clientid.
Łukasz
On Tue, Mar 16, 2010 at 16:20, Benjamin Reed wrote:
do you ever use zookeeper_init() with the clientid field set to
something other than null?
ben
On 03/16/2010 07:43 AM, Łukasz Osipiuk wrote:
Hi everyone!
I am writing to this group because recently we are getting some
strange errors with our production zookeeper setup.
From time to time we
it is a bit confusing but initLimit is the timer that is used when a
follower connects to a leader. there may be some state transfers
involved to bring the follower up to speed so we need to be able to
allow a little extra time for the initial connection.
after that we use syncLimit to figure
no, you cannot watch for ACL changes. it is one of the
API/implementation simplifications we did since we didn't have a good
use case for it.
it does seem a little bit weird. we are following file system semantics
here. i guess for ultimate security only clients with admin permission
would be
just to expand on mahadev's answer a little bit: the basic guarantee is
that you will see the watch event before you see the change. so let's
say you call getChildren( "/foo", w, acb, ctx) twice and while you do
that another client creates a child of /foo. there are three scenarios:
1) the cre
i was looking through the docs to see if we talk about handling session
expired, but i couldn't find anything. we should probably open a jira to add to
the docs, unless i missed something. did i?
ben
-Original Message-
From: Mahadev Konar [mailto:maha...@yahoo-inc.com]
Sent: Monday, Fe
i second ted's proposals! thanx ted.
there is one other option. when you create the ZooKeeper object you can
pass a session id and password. your bounced server can actually
reattach to the session. (that is why we put that constructor in.) to
use it you need to save the session id and passwor
sadly connectionloss is the really ugly part of zookeeper! it is a pain
to deal with. i'm not sure we have best practice, but i can tell you
what i do :) ZOOKEEPER-22 is meant to alleviate this problem.
i usually use the asynch API when handling the watch callback. in the
completion function i
there aren't any dependencies on jboss. can you clarify the dependency
that you are seeing?
thanx
ben
Gustavo Niemeyer wrote:
Hello there,
Is the dependency on JBoss a hard one, or is there a way to not use
it? Perhaps an alternative package providing the same interface?
I'm trying to get i
henry is correct. just to state another way, Zab guarantees that if a
quorum of servers have accepted a transaction, the transaction will
commit. this means that if less than a quorum of servers have accepted a
transaction, we can commit or discard. the only constraint we have in
choosing is or
hi Qing,
i'm glad you like the page and Zab.
yes, we are very familiar with Paxos. that page is meant to show a
weakness of Paxos and a design point for Zab. it is not to say Paxos is
not useful. Paxos is used in the real world in production systems.
sometimes there are not order dependencies
no please open a jira as a new feature request.
sent from my droid
-Original Message-
From: Steve Chu [stv...@gmail.com]
Received: 12/21/09 3:44 AM
To: zookeeper-user@hadoop.apache.org [zookeeper-u...@hadoop.apache.org]
Subject: Does zookeeper support listening on a specified address?
H
I agree with Ted, it doesn't seem like a good idea to do in practice.
however, you do have a couple of options if you are just testing things:
1) use tmpfs
2) you can set forceSync to "no" in the configuration file to disable
syncing to disk before acknowledging responses
3) if you really want
there aren't any limits on the number of znodes, it's just limited by
your memory. there are two things (probably more :) to keep in mind:
1) the 1M limit also applies to the children list. you can't grow the
list of children to more than 1M (the sum of the names of all of the
children) otherw
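The child-list limit above can be checked with simple arithmetic (a sketch; the ~1MB figure is the default jute.maxbuffer, and the exact per-entry overhead is an assumption for illustration):

```java
import java.util.*;

public class ChildListBudget {
    static final int MAX_BUFFER = 1024 * 1024; // default jute.maxbuffer, ~1MB

    // rough size of a getChildren response: the sum of the child names
    // plus a small per-entry overhead (assumed 4 length bytes here)
    static int responseSize(List<String> children) {
        int size = 0;
        for (String name : children) size += name.length() + 4;
        return size;
    }

    static boolean fits(List<String> children) {
        return responseSize(children) <= MAX_BUFFER;
    }

    public static void main(String[] args) {
        List<String> kids = new ArrayList<>();
        // assumed: sequential children like "item-0000000042" (15 chars each)
        for (int i = 0; i < 100_000; i++) kids.add(String.format("item-%010d", i));
        System.out.println(fits(kids)); // 100k * 19 bytes ~ 1.9MB: too big
    }
}
```

So it is the total length of all the names under one parent, not the number of children per se, that hits the limit first.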
there are a bunch of presentations you can grab at
http://wiki.apache.org/hadoop/ZooKeeper/ZooKeeperPresentations
ben
Mark Vigeant wrote:
Hey Everyone,
I'm supposed to give a presentation next week about the basic functionality and
uses of zookeeper. I was wondering if anybody out there had:
david, it should be pretty easy to do since we do it in our test cases.
(start and stop servers.) the problem is that we haven't really exposed
the interfaces. (but we have wanted to.) and we don't have tests for
those non-existent exposed interfaces :) with a clean interface it
should be prett
right at the beginning of
http://hadoop.apache.org/zookeeper/docs/r3.2.1/zookeeperStarted.html it
shows you the minimum standalone configuration.
that doesn't explain the 0 id. i'd like to try and reproduce it. do you
have an empty data directory with a single file, myid, set to 1?
ben
Leona
so you have two problems going on. both have the same root:
zookeeper_init returns before a connection and session is established
with zookeeper, so you will not be able to fill in myid until a
connection is made. you can do something with a mutex in the watcher to
wait for a connection, or you
is some details im not getting here :-)
Regards, Orjan
On Fri, Sep 25, 2009 at 3:56 PM, Benjamin Reed
wrote:
can you clarify what you are asking for? are you just looking for
motivation? or are you trying to find out how to use it?
the myid file just has the unique identifier (number) of the server in
the cluster. that number is matched against the id in the configuration
file. there isn't much to sa
oh yes, that is a scenario that may generate a connection refused.
ben
Ted Dunning wrote:
Good points.
On the other hand, it could still be firewall issues.
On Wed, Sep 23, 2009 at 8:30 AM, Benjamin Reed wrote:
The "connection refused" message as opposed to no route to host, or
unknown host, indicate that zookeeper has not been started on the other
machines. are the other machines giving similar errors?
ben
Le Zhou wrote:
Hi,
I'm trying to install HBase 0.20.0 in fully distributed mode on my cluster
what error do you get?
ben
Todd Greenwood wrote:
I'm attempting to secure a zookeeper installation using zookeeper ACLs.
However, I'm finding that while Ids.OPEN_ACL_UNSAFE works great, my
attempts at using Ids.CREATOR_ALL_ACL are failing. Here's a code
snippet:
public class ZooWrapper
{
/*
these suggestions would be great to put in a faq!
thanx ted
ben
Ted Dunning wrote:
I always used a large node for ZK to avoid sharing the machine, but the
reason for doing that turned out to be incorrect. In fact, my problem was
to do with GC on the client side.
I can't believe that they are
are you using the single threaded or multithreaded C library? the exceeded
deadline message means that our thread was supposed to get control after a
certain period, but we got control that many milliseconds late. what is your
session timeout?
ben
From:
good point david! zhang can you try david's scripts? we should probably
commit those. thanx for pointing them out david.
ben
David Bosschaert wrote:
FWIW, I've uploaded some Windows versions of the zookeeper scripts to
https://issues.apache.org/jira/browse/ZOOKEEPER-426 a while ago. They
run f
: Re: exist return true before event comes in
Interesting, that basically means if I want strict order, I have to
use the async api?
~~~
Hadoop training and consulting
http://www.scaleunlimited.com
http://www.101tec.com
On Aug 3, 2009, at 8:10 PM, Benjamin Reed wrote
I assume you are calling the synchronous version of exists. The callbacks for
both the watches and async calls are processed by a callback thread, so the
ordering is strict. Synchronous call responses are not queued to the callback
thread. (this allows you to make synchronous calls in callbacks
Or maybe /usr/local/include/zookeeper but either way c-client-src is weird.
Please open a jira.
Thanx
ben
Sent from my phone.
-Original Message-
From: Michi Mutsuzaki
Sent: Saturday, August 01, 2009 6:15 PM
To: zookeeper-user@hadoop.apache.org
Subject: c client header location
Hello
the processing of the write transaction is described in the zookeeper
internals presentation on
http://wiki.apache.org/hadoop/ZooKeeper/ZooKeeperPresentations i think
other presentations may also touch on it. we also have it in the
ZooKeeper documentation:
http://hadoop.apache.org/zookeeper/do
ty concerns as
well?
Sorry for all the questions, just trying to get the story straight so
that we don't spread misinformation to HBase users. Most users start
out on very small clusters, so dedicated ZK nodes are not a realistic
assumption... How big of a deal is that?
JG
Benjamin R
we designed zk to have high performance so that it can be shared by
multiple applications. the main thing is that you use dedicated zk
machines (with a dedicated disk for logging). once you have that in
place, watch the load on your cluster, as long as you aren't saturating
the cluster you shou
the create is atomic. we just use a data structure that does not store
the list of children in order.
ben
Erik Holstad wrote:
Hey Patrik!
Thanks for the reply.
I understand all the reasons that you posted above and totally agree that
nodes should not be sorted since you then have to pay that o
sorry to jump in late.
if i understand the scenario correctly, you are partitioned from ZK, but
you still have access to the NN on which you are holding leases to
files. the problem is that even though your ephemeral nodes may timeout,
you are still holding a lease on the NN and recovery would
hat, I'll open a jira and give it a try.
J-D
On Tue, Jun 23, 2009 at 6:04 PM, Benjamin Reed wrote:
ZooKeeper only tells you about states that it is sure about, so you will
not get the Expired event until you reconnect to ZooKeeper. if you never
connect again to ZooKeeper, you will not get the Expired event. if you
want to timeout using some sanity value, 2 times the session timeout for
examp
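The "2x the session timeout" sanity check suggested above can be sketched client-side (illustrative only; ZooKeeper itself only delivers the Expired event after a successful reconnect, so this is purely a local guess):

```java
public class ExpiryGuess {
    final long sessionTimeoutMs;
    long disconnectedSinceMs = -1; // -1 while connected

    ExpiryGuess(long sessionTimeoutMs) { this.sessionTimeoutMs = sessionTimeoutMs; }

    void onDisconnected(long nowMs) { disconnectedSinceMs = nowMs; }
    void onReconnected()            { disconnectedSinceMs = -1; }

    // while disconnected the session state is unknowable; after 2x the
    // session timeout it is almost certainly gone, so give up locally
    boolean assumeExpired(long nowMs) {
        return disconnectedSinceMs >= 0
            && nowMs - disconnectedSinceMs > 2 * sessionTimeoutMs;
    }
}
```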
We have discovered that there is a bug in ZooDefs.PERMS.ALL: it is
missing ZooDefs.PERMS.ADMIN, thus it isn't really ALL :) The problem is
that the C binding includes ADMIN in ALL, so we have an inconsistency
between the two bindings. We would like to fix this as a bug fix in the
next release,
just to clarify i believe you are talking about callbacks on the watch
object you are passing in the asynchronous call rather than the
asynchronous completion callback. (Henry is making the same assumption.)
when you say you are getting the callback 10 times, i believe you are
talking about 10