We are on Java 6.

Jun

On Mon, Jun 6, 2011 at 12:13 PM, Fournier, Camille F. [Tech] <
[email protected]> wrote:

> Hey Jun, question: What version of Java are your clients running? I keep
> hitting a bug in my java5 test suite and I'm wondering if in fact I am
> seeing the same problem you're reporting here.
>
> C
>
> -----Original Message-----
> From: Jun Rao [mailto:[email protected]]
> Sent: Friday, June 03, 2011 12:59 PM
> To: [email protected]
> Subject: Re: lost ZK events across datacenters
>
> I don't expect that we can discover the problem right now. However, what
> are the things that I can do to collect enough tracing should the problem
> occur again in the future (e.g., is INFO level logging enough)?
>
> Thanks,
>
> Jun
>
> On Fri, Jun 3, 2011 at 9:56 AM, Jun Rao <[email protected]> wrote:
>
> > The log doesn't have any state changing entries around the time the
> > watcher is triggered, in all clients.
> >
> > Jun
> >
> >
> > On Fri, Jun 3, 2011 at 9:32 AM, Fournier, Camille F. [Tech] <
> > [email protected]> wrote:
> >
> >> Any state changes for the problem client between setting the watch and
> >> when you expected it to get called? Do you have logs for that client vs
> >> the others that show anything?
> >>
> >> -----Original Message-----
> >> From: Jun Rao [mailto:[email protected]]
> >> Sent: Friday, June 03, 2011 4:40 AM
> >> To: [email protected]
> >> Subject: Re: lost ZK events across datacenters
> >>
> >> Ben,
> >>
> >> Some details below.
> >>
> >> The call that sets the watcher simple calls getChildren with watcher
> >> flag set to true. The triggering change is that one of the child nodes
> >> (which is ephemeral) is deleted because the creating client is gone.
> >>
> >> Thanks,
> >>
> >> Jun
> >>
> >> On Thu, Jun 2, 2011 at 10:49 AM, Benjamin Reed <[email protected]> wrote:
> >>
> >> > can you tell us a bit more about the scenario? what was the call the
> >> > set the watch event? and what were the changes that caused the event?
> >> >
> >> > thanx
> >> > ben
> >> >
> >> > On Wed, Jun 1, 2011 at 3:14 PM, Jun Rao <[email protected]> wrote:
> >> > > All my clients were on different machines. 2 of them got the watcher
> >> > > fired about the same time. The third one never got the watcher
> >> > > triggered.
> >> > >
> >> > > Thanks,
> >> > >
> >> > > Jun
> >> > >
> >> > > On Wed, Jun 1, 2011 at 2:18 PM, Fournier, Camille F. [Tech] <
> >> > > [email protected]> wrote:
> >> > >
> >> > >> All clients are in different processes?
> >> > >> I've used zkclient and haven't seen any problems, but I haven't
> >> > >> hammered it too hard yet. I took a long look at the code and didn't
> >> > >> see any errors but there could always be something very subtle.
> >> > >>
> >> > >> -----Original Message-----
> >> > >> From: Jun Rao [mailto:[email protected]]
> >> > >> Sent: Wednesday, June 01, 2011 4:09 PM
> >> > >> To: [email protected]
> >> > >> Subject: Re: lost ZK events across datacenters
> >> > >>
> >> > >> I am using the zkclient package (
> >> > >> https://github.com/sgroschupf/zkclient.git).
> >> > >> The watcher code seems reasonable. Basically, each watcher event is
> >> > >> first added to a queue. A separate event thread dequeues each event
> >> > >> and reads the children of a path (which re-registers the watcher)
> >> > >> and invokes the registered listener.
> >> > >>
> >> > >> Anybody knows any issues in zkclient?
> >> > >>
> >> > >> Thanks,
> >> > >>
> >> > >> Jun
> >> > >>
> >> > >> On Wed, Jun 1, 2011 at 12:04 PM, Ted Dunning <[email protected]>
> >> > >> wrote:
> >> > >>
> >> > >> > This is most commonly due, in my own history of programming
> >> > >> > errors, to writing code that has a race window in it. It is
> >> > >> > conceivable that cross data-center operation would make such a
> >> > >> > race more of a problem.
> >> > >> >
> >> > >> > Can you say a bit about your code? Did you make sure to use
> >> > >> > standard idioms as opposed to setting the watch in a different
> >> > >> > call from reading the data?
> >> > >> >
> >> > >> > On Wed, Jun 1, 2011 at 11:40 AM, Jun Rao <[email protected]> wrote:
> >> > >> >
> >> > >> > > Hi,
> >> > >> > >
> >> > >> > > I have a setup where multiple ZK clients are sitting in a
> >> > >> > > different datacenter from the ZK server. All clients registered
> >> > >> > > the same child watcher on a path. However, when the children of
> >> > >> > > the path changed, the watcher on 1 of the clients didn't fire.
> >> > >> > > This seems to have happened a couple of times to me. I am using
> >> > >> > > ZK 3.3.3. Has anyone used ZK in a cross datacenter setup and
> >> > >> > > seen problems like that before?
> >> > >> > >
> >> > >> > > Thanks,
> >> > >> > >
> >> > >> > > Jun
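
For anyone following the thread, here is a rough sketch of the pattern Jun describes for zkclient (each watcher event goes onto a queue, and a separate event thread dequeues it and re-reads the children, which re-registers the watcher), combined with the idiom Ted asks about (setting the watch in the same call that reads the data). It is a pure-JDK simulation, not zkclient or ZooKeeper code: FakeNode, ChildWatchSketch, and the "changed"/"stop" event strings are hypothetical names made up for illustration, and the only behaviour assumed is that a watch fires once and must be re-registered by the next read.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical in-memory stand-in for a znode with one-shot child watches.
class FakeNode {
    private final Set<String> children = new TreeSet<String>();
    private final List<Runnable> watches = new ArrayList<Runnable>();

    // Reads the children and registers the watch in the same call, so no
    // change can slip in between the read and the registration.
    synchronized List<String> getChildren(Runnable watch) {
        if (watch != null) {
            watches.add(watch);
        }
        return new ArrayList<String>(children);
    }

    synchronized void addChild(String name) {
        if (children.add(name)) fireWatches();
    }

    synchronized void removeChild(String name) {
        if (children.remove(name)) fireWatches();
    }

    // Watches are one-shot: each fires once and is then forgotten.
    private void fireWatches() {
        List<Runnable> toFire = new ArrayList<Runnable>(watches);
        watches.clear();
        for (Runnable w : toFire) w.run();
    }
}

class ChildWatchSketch {
    // Returns every child list the listener observed, in order.
    static List<List<String>> run() throws InterruptedException {
        final FakeNode node = new FakeNode();
        final BlockingQueue<String> events = new LinkedBlockingQueue<String>();
        final List<List<String>> snapshots =
                Collections.synchronizedList(new ArrayList<List<String>>());
        // The watch only enqueues; the real work happens on the event thread.
        final Runnable watch = () -> events.add("changed");

        node.addChild("a");
        node.addChild("b");
        snapshots.add(node.getChildren(watch)); // initial read + watch

        Thread eventThread = new Thread(() -> {
            try {
                while (true) {
                    String event = events.take();
                    if (event.equals("stop")) return;
                    // Re-reading re-registers the one-shot watch.
                    snapshots.add(node.getChildren(watch));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        eventThread.start();

        node.removeChild("b"); // e.g. an ephemeral child going away
        node.addChild("c");
        events.add("stop");    // queued after any change events above
        eventThread.join();
        return snapshots;
    }

    public static void main(String[] args) throws InterruptedException {
        for (List<String> s : run()) System.out.println(s);
    }
}
```

In this sketch the number of snapshots can vary from run to run, because a change that lands before the watch is re-registered produces no event of its own. The last snapshot still reflects the final children, since the re-read happens after the event that triggered it. If the read and the watch registration were separate calls, a change falling between them would be seen by neither, which is the kind of race window Ted mentions.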
