On 30 August 2012 17:06, Ben Bangert <[email protected]> wrote:

> On Aug 30, 2012, at 4:50 PM, Michi Mutsuzaki <[email protected]>
> wrote:
>
> > I don't think there is an official spec beyond the zookeeper.jute
> > file. It would be very helpful if you can share what you have found
> > implementing your python client.
>
> The most interesting thing I found was that documented C/Java behavior
> isn't actually enforced beyond the client boundaries. For some reason I had
> expected the state transitions listed to be some limitation of the
> Zookeeper protocol itself. So it has had me curious about ways the client
> implementation itself could make using Zookeeper less error-prone, as
> Curator and Kazoo respectively try to make working with Zookeeper less
> error-prone in their languages. This is part of why I was baffled when an
> AUTH_FAILED resulted in connection termination, but not session expiration.
>
> In the case of kazoo, it's now substantially easier to debug what is
> actually happening since it's always within the bounds of Python.
> Reinstalling debug builds of the C lib and Python C binding got hairy, and
> still obfuscated a lot. This made debugging the prior password mangling
> issue in the Python lib a major pain.
>
> I've found that the C lib seems like a bit of a second-class citizen for
> Zookeeper... the read-only feature has had patches submitted since 2010,
> which still aren't applied. The Python C binding also has multiple
> patches... not applied. We applied several of these to fix memory leaks and
> the password mangling to our zc-zookeeper-static Python lib, but after
> looking through all the remaining C lib bugs/patches and Python C
> bugs/patches, it's way more than we want to deal with. C isn't my forte, and
> I'd much, much rather debug Java code or read the Java client if I'm curious
> about how something is done, rather than deal with 2 layers of C code in
> addition to TCP and the Java server.
>
> Meanwhile, I can implement the new Zookeeper 3.4 methods after looking at
> the jute code in just minutes, and debugging a problem is trivial when it's
> all Python code. Some people have mentioned that without the C OS thread
> used by the C binding, it's possible in heavy Python threading thrashes that
> a ping might not be sent... which is true, but the session timeout can be
> increased and that's a very small price to pay given the other things noted
> here with the C lib/binding.
>
> And of course, for other runtimes (PyPy) that can't run C extensions, the
> pure Python kazoo will now work.
>
FWIW, this is my only reservation about a pure Python client - there isn't
a spec, and three separate implementations that might have subtly different
behaviours can be a nightmare to maintain.

Ben - if you're able to turn any of your efforts towards documenting your
observations about how the protocol actually works, that would be awesome.

And as regards the unapplied Python patches - that's my bad, I should be
committing them much more often. Can you give me a list of those you've
found useful, and in return for your excellent work I'll get them committed
as soon as I can?

>
> > I think this is expected. ZooKeeper should not expire a session
> > because of authentication failure. That would make it easier for a
> > malicious client to expire random sessions. I don't know if there is a
> > technical reason for dropping the connection though. Maybe it was an
> > arbitrary decision?
>
> Yea, I wasn't terribly sure. The doc state diagram seems to indicate that
> an AUTH_FAILED should be treated the same as a SESSION_EXPIRATION or
> CLOSING event. However, your session is dead in the latter two cases, while
> the session is *not* dead in AUTH_FAILED, yet you end up in the same state.
> I would not be surprised if a substantial amount of code assumed that the
> session was dead when an AUTH_FAILED occurred, yet the session is not dead
> at all.
>
> Cheers,
> Ben

-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679
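
For anyone following the AUTH_FAILED discussion quoted above, here is a
minimal Python sketch of the distinction Ben describes: the connection ends
in all three states, but the session is only dead in two of them. The enum,
the two sets and the helper names are hypothetical and written purely for
illustration. They are not taken from kazoo, the C client or the Java client,
and the mapping reflects only what is reported in this thread, not an
official spec.

```python
from enum import Enum


class ClientState(Enum):
    """Hypothetical state names, mirroring the events named in this thread."""
    CONNECTED = "CONNECTED"
    AUTH_FAILED = "AUTH_FAILED"
    SESSION_EXPIRED = "SESSION_EXPIRED"
    CLOSING = "CLOSING"


# Per the thread: all three of these leave the client without a connection.
CONNECTION_ENDED = frozenset({
    ClientState.AUTH_FAILED,
    ClientState.SESSION_EXPIRED,
    ClientState.CLOSING,
})

# Per the thread: only these two mean the session itself is gone.
# AUTH_FAILED is deliberately absent.
SESSION_DEAD = frozenset({
    ClientState.SESSION_EXPIRED,
    ClientState.CLOSING,
})


def connection_ended(state: ClientState) -> bool:
    """True if the client can no longer use its current connection."""
    return state in CONNECTION_ENDED


def session_is_dead(state: ClientState) -> bool:
    """True only if the session is gone, not merely the connection."""
    return state in SESSION_DEAD
```

Application code that cleans up session-scoped assumptions (for example,
treating its ephemeral nodes as gone) would check session_is_dead() rather
than connection_ended(), so that an AUTH_FAILED is not mistaken for an
expired session.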
