Curator handles SASL connections
https://issues.apache.org/jira/browse/KAFKA-1695

On Wed, Feb 4, 2015, at 06:10 AM, Jaikiran Pai wrote:
> FWIW - the ZkClient project team have merged the pull request that I had 
> submitted to allow for timeouts to operations 
> https://github.com/sgroschupf/zkclient/pull/29. I heard from Johannes 
> (from the ZkClient project team) that they don't have any specific 
> release date in mind but are willing to release a new version if/when we 
> need one.
> 
> -Jaikiran
> 
> On Wednesday 04 February 2015 12:33 AM, Gwen Shapira wrote:
> > So I think the current plan is:
> > 1. Add timeout in zkclient
> > 2. Ask zkclient to release a new version (we need it for a few other things too)
> > 3. Rebase on new zkclient
> > 4. Fix this jira and the few others that were waiting for the new zkclient
> >
> > Does that make sense?
> >
> > Gwen
> >
> > On Mon, Feb 2, 2015 at 8:33 PM, Jaikiran Pai <jai.forums2...@gmail.com> 
> > wrote:
> >> I just heard back from Stefan, who manages the ZkClient repo, and he seems
> >> to be open to having these changes be part of the ZkClient project. I'll be
> >> creating a pull request for that project to have it reviewed and merged.
> >> Although I haven't heard of exact release plans, Stefan's reply did indicate
> >> that the project could be released after this change is merged.
> >>
> >> -Jaikiran
> >>
> >> On Tuesday 03 February 2015 09:03 AM, Jaikiran Pai wrote:
> >>> Thanks for pointing to that repo!
> >>>
> >>> I just had a look at it and it appears that the project isn't very active
> >>> (going by the lack of recent activity). The latest contribution is from Gwen and
> >>> that was around 3 months back. I haven't found release plans for that
> >>> project or a place to ask about it (filing an issue doesn't seem right to
> >>> ask this question). So I'll get in touch with the repo owner and see what
> >>> his plans for the project are.
> >>>
> >>> -Jaikiran
> >>>
> >>> On Monday 02 February 2015 11:33 PM, Gwen Shapira wrote:
> >>>> I did!
> >>>>
> >>>> Thanks for clarifying :)
> >>>>
> >>>> The client that is part of Zookeeper itself actually does support
> >>>> timeouts.
> >>>>
> >>>> On Mon, Feb 2, 2015 at 9:54 AM, Guozhang Wang <wangg...@gmail.com> wrote:
> >>>>> Hi Jaikiran,
> >>>>>
> >>>>> I think Gwen was talking about contributing to ZkClient project:
> >>>>>
> >>>>> https://github.com/sgroschupf/zkclient
> >>>>>
> >>>>> Guozhang
> >>>>>
> >>>>>
> >>>>> On Sun, Feb 1, 2015 at 5:30 AM, Jaikiran Pai <jai.forums2...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi Gwen,
> >>>>>>
> >>>>>> Yes, the KafkaZkClient is a wrapper around ZkClient and not a complete
> >>>>>> replacement.
> >>>>>>
> >>>>>> As for contributing to Zookeeper, yes, that is indeed on my mind, but I
> >>>>>> haven't yet had a chance to really look deeper into Zookeeper or get in
> >>>>>> touch with their dev team to try and explain this potential improvement
> >>>>>> to them. I have no objection to contributing this or something similar
> >>>>>> to Zookeeper directly. I think I should be able to bring this up in the
> >>>>>> Zookeeper dev forum sometime in the next few weekends.
> >>>>>>
> >>>>>> -Jaikiran
> >>>>>>
> >>>>>>
> >>>>>> On Sunday 01 February 2015 11:40 AM, Gwen Shapira wrote:
> >>>>>>
> >>>>>>> It looks like the new KafkaZkClient is a wrapper around ZkClient, but
> >>>>>>> not a replacement. Did I get it right?
> >>>>>>>
> >>>>>>> I think a wrapper for ZkClient can be useful - for example KAFKA-1664
> >>>>>>> can also use one.
> >>>>>>>
> >>>>>>> However, I'm wondering why not contribute the fix directly to ZKClient
> >>>>>>> project and ask for a release that contains the fix?
> >>>>>>> This will benefit other users of the project who may also need a
> >>>>>>> timeout (that's pretty basic...)
> >>>>>>>
> >>>>>>> As an alternative, if we don't want to collaborate with ZKClient for
> >>>>>>> some reason, forking the project into Kafka will probably give us more
> >>>>>>> control than wrappers and without much downside.
> >>>>>>>
> >>>>>>> Just a thought.
> >>>>>>>
> >>>>>>> Gwen
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Sat, Jan 31, 2015 at 6:32 AM, Jaikiran Pai
> >>>>>>> <jai.forums2...@gmail.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Neha, Ewen (and others), my initial attempt to solve this is uploaded
> >>>>>>>> here: https://reviews.apache.org/r/30477/. It solves the shutdown
> >>>>>>>> problem, and now the server shuts down even when Zookeeper has gone
> >>>>>>>> down before the Kafka server.
> >>>>>>>>
> >>>>>>>> I went with the approach of introducing a custom (enhanced) ZkClient
> >>>>>>>> which, for now, allows timeouts to be optionally specified for certain
> >>>>>>>> operations. I intentionally haven't forced the use of this new
> >>>>>>>> KafkaZkClient all over the code and instead, for now, have just used it
> >>>>>>>> in the KafkaServer.
> >>>>>>>>
> >>>>>>>> Does this patch look like something worth using?
> >>>>>>>>
> >>>>>>>> -Jaikiran
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Thursday 29 January 2015 10:41 PM, Neha Narkhede wrote:
> >>>>>>>>
> >>>>>>>>> Ewen is right. ZkClient APIs are blocking and the right fix for this
> >>>>>>>>> seems to be patching ZkClient. At some point, if we find ourselves
> >>>>>>>>> fiddling too much with ZkClient, it wouldn't hurt to write our own
> >>>>>>>>> little zookeeper client wrapper.
> >>>>>>>>>
> >>>>>>>>> On Thu, Jan 29, 2015 at 12:57 AM, Ewen Cheslack-Postava
> >>>>>>>>> <e...@confluent.io>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Looks like a bug to me -- the underlying ZK library wraps a lot of
> >>>>>>>>>> blocking method implementations with waitUntilConnected() calls
> >>>>>>>>>> without any timeouts. Ideally we could just add a version of
> >>>>>>>>>> ZkUtils.getController() with a timeout, but I don't see an easy way
> >>>>>>>>>> to accomplish that with ZkClient.
> >>>>>>>>>>
> >>>>>>>>>> There's at least one other call to ZkUtils besides the one in the
> >>>>>>>>>> stacktrace you gave that would cause the same issue, possibly more
> >>>>>>>>>> that aren't directly called in that method. One ugly solution would
> >>>>>>>>>> be to use an extra thread during shutdown to trigger timeouts, but
> >>>>>>>>>> I'd imagine we probably have other threads that could end up
> >>>>>>>>>> blocking in similar ways.
> >>>>>>>>>>
> >>>>>>>>>> I filed https://issues.apache.org/jira/browse/KAFKA-1907 to track
> >>>>>>>>>> the issue.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Jan 26, 2015 at 6:35 AM, Jaikiran Pai <
> >>>>>>>>>> jai.forums2...@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> The main culprit is this thread, which goes into "forever retry
> >>>>>>>>>>> connection to a closed zookeeper" when I shut down Kafka (via
> >>>>>>>>>>> Ctrl + C) after zookeeper has already been shut down. I have
> >>>>>>>>>>> attached the complete thread dump, but I don't know if it will be
> >>>>>>>>>>> delivered to the mailing list.
> >>>>>>>>>>>
> >>>>>>>>>>> "Thread-2" prio=10 tid=0xb3305000 nid=0x4758 waiting on condition [0x6ad69000]
> >>>>>>>>>>>    java.lang.Thread.State: TIMED_WAITING (parking)
> >>>>>>>>>>>         at sun.misc.Unsafe.park(Native Method)
> >>>>>>>>>>>         - parking to wait for  <0x70a93368> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> >>>>>>>>>>>         at java.util.concurrent.locks.LockSupport.parkUntil(LockSupport.java:267)
> >>>>>>>>>>>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitUntil(AbstractQueuedSynchronizer.java:2130)
> >>>>>>>>>>>         at org.I0Itec.zkclient.ZkClient.waitForKeeperState(ZkClient.java:636)
> >>>>>>>>>>>         at org.I0Itec.zkclient.ZkClient.waitUntilConnected(ZkClient.java:619)
> >>>>>>>>>>>         at org.I0Itec.zkclient.ZkClient.waitUntilConnected(ZkClient.java:615)
> >>>>>>>>>>>         at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:679)
> >>>>>>>>>>>         at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766)
> >>>>>>>>>>>         at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761)
> >>>>>>>>>>>         at kafka.utils.ZkUtils$.readDataMaybeNull(ZkUtils.scala:456)
> >>>>>>>>>>>         at kafka.utils.ZkUtils$.getController(ZkUtils.scala:65)
> >>>>>>>>>>>         at kafka.server.KafkaServer.kafka$server$KafkaServer$$controlledShutdown(KafkaServer.scala:194)
> >>>>>>>>>>>         at kafka.server.KafkaServer$$anonfun$shutdown$1.apply$mcV$sp(KafkaServer.scala:269)
> >>>>>>>>>>>         at kafka.utils.Utils$.swallow(Utils.scala:172)
> >>>>>>>>>>>         at kafka.utils.Logging$class.swallowWarn(Logging.scala:92)
> >>>>>>>>>>>         at kafka.utils.Utils$.swallowWarn(Utils.scala:45)
> >>>>>>>>>>>         at kafka.utils.Logging$class.swallow(Logging.scala:94)
> >>>>>>>>>>>         at kafka.utils.Utils$.swallow(Utils.scala:45)
> >>>>>>>>>>>         at kafka.server.KafkaServer.shutdown(KafkaServer.scala:269)
> >>>>>>>>>>>         at kafka.server.KafkaServerStartable.shutdown(KafkaServerStartable.scala:42)
> >>>>>>>>>>>         at kafka.Kafka$$anon$1.run(Kafka.scala:42)
> >>>>>>>>>>>
> >>>>>>>>>>> -Jaikiran
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Monday 26 January 2015 05:46 AM, Neha Narkhede wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> For a clean shutdown, the broker tries to talk to the controller and
> >>>>>>>>>>>> also issues reads to zookeeper. Possibly that is where it tries to
> >>>>>>>>>>>> reconnect to zk. It will help to look at the thread dump.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks
> >>>>>>>>>>>> Neha
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Fri, Jan 23, 2015 at 8:53 PM, Jaikiran Pai <
> >>>>>>>>>>>> jai.forums2...@gmail.com
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> I was just playing around with the RC2 of 0.8.2 and noticed that if
> >>>>>>>>>>>>> I shut down zookeeper first I can't shut down the Kafka server at
> >>>>>>>>>>>>> all, since it goes into a never-ending attempt to reconnect with
> >>>>>>>>>>>>> zookeeper. I had to kill the Kafka process to stop it. I tried it
> >>>>>>>>>>>>> against trunk too, and there too I see the same issue. Should I
> >>>>>>>>>>>>> file a JIRA for this and see if I can come up with a patch?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> FWIW, here are the unending (and IMO too frequent) attempts at
> >>>>>>>>>>>>> trying to reconnect. I also have a thread dump which shows that
> >>>>>>>>>>>>> the other thread, which is trying to complete a controlled
> >>>>>>>>>>>>> shutdown of Kafka, is blocked forever waiting for zookeeper to be
> >>>>>>>>>>>>> up. I can attach it to the JIRA.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> [2015-01-24 10:15:46,278] WARN Session 0x14b1a4136800000 for server null, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> >>>>>>>>>>>>> java.net.ConnectException: Connection refused
> >>>>>>>>>>>>>          at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> >>>>>>>>>>>>>          at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
> >>>>>>>>>>>>>          at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
> >>>>>>>>>>>>>          at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> >>>>>>>>>>>>> [2015-01-24 10:15:47,437] INFO Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
> >>>>>>>>>>>>> [2015-01-24 10:15:47,438] WARN Session 0x14b1a4136800000 for server null, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> >>>>>>>>>>>>> java.net.ConnectException: Connection refused
> >>>>>>>>>>>>>          at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> >>>>>>>>>>>>>          at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
> >>>>>>>>>>>>>          at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
> >>>>>>>>>>>>>          at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> >>>>>>>>>>>>> [2015-01-24 10:15:49,056] INFO Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
> >>>>>>>>>>>>> [2015-01-24 10:15:49,057] WARN Session 0x14b1a4136800000 for server null, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> >>>>>>>>>>>>> java.net.ConnectException: Connection refused
> >>>>>>>>>>>>>          at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> >>>>>>>>>>>>>          at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
> >>>>>>>>>>>>>          at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
> >>>>>>>>>>>>>          at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> >>>>>>>>>>>>> [2015-01-24 10:15:50,801] INFO Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
> >>>>>>>>>>>>> [2015-01-24 10:15:50,802] WARN Session 0x14b1a4136800000 for server null, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> >>>>>>>>>>>>> java.net.ConnectException: Connection refused
> >>>>>>>>>>>>>          at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> >>>>>>>>>>>>>          at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
> >>>>>>>>>>>>>          at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
> >>>>>>>>>>>>>          at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> -Jaikiran
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>    --
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Ewen
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>> --
> >>>>> -- Guozhang
> >>>
> 
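
For readers following the thread: the "extra thread during shutdown to
trigger timeouts" idea that Ewen mentions can be sketched in plain Java with
an executor and Future.get(timeout, unit). This is only an illustrative
sketch under stated assumptions, not the patch at
https://reviews.apache.org/r/30477/ and not the ZkClient pull request.
BoundedCall and callWithTimeout are hypothetical names, and the Callable
stands in for a blocking read such as ZkUtils.getController().

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedCall {

    // Hypothetical helper: runs a blocking call on a worker thread and
    // stops waiting for it after timeoutMs milliseconds.
    static <T> Optional<T> callWithTimeout(Callable<T> blockingCall, long timeoutMs)
            throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-call");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(blockingCall);
        try {
            // Wait at most timeoutMs for the result.
            return Optional.ofNullable(future.get(timeoutMs, TimeUnit.MILLISECONDS));
        } catch (TimeoutException | ExecutionException e) {
            // Timed out or failed: interrupt the worker and report "no result".
            future.cancel(true);
            return Optional.empty();
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // A call that completes well within the timeout.
        Optional<String> fast = callWithTimeout(() -> "controller-1", 1000);
        // A call that stands in for a read blocking while ZooKeeper is down.
        Optional<String> slow = callWithTimeout(() -> {
            Thread.sleep(5000);
            return "late";
        }, 100);
        System.out.println("fast=" + fast.isPresent() + " slow=" + slow.isPresent());
    }
}
```

Whether cancelling actually unblocks the worker depends on the blocking call
honouring thread interruption, which is an assumption here. The daemon flag
is set so that a worker that stays stuck cannot by itself prevent the JVM
from exiting during shutdown.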
