That would solve this, but it looks like a workaround. We need to check
exactly why this happens and get to the root cause, which might be really
useful. What do you think?
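
For concreteness, the bounded retry with a handler that Tao proposes below
might look roughly like this. It is only a sketch and not the ZkUtils code:
BoundedRetry, ExhaustedHandler and callWithBackoff are hypothetical names,
and the action passed in stands in for the ephemeral node creation.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Kafka: retries an action with exponential
// backoff and gives up after a fixed number of attempts.
final class BoundedRetry {

    /** Hypothetical callback invoked once all attempts have failed. */
    interface ExhaustedHandler {
        void onRetriesExhausted(int attempts, Exception lastError);
    }

    static <T> T callWithBackoff(Callable<T> action,
                                 int maxAttempts,
                                 long initialBackoffMs,
                                 long maxBackoffMs,
                                 ExhaustedHandler handler) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoffMs = initialBackoffMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (InterruptedException e) {
                // Do not retry when the thread is asked to stop.
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Wait, then double the delay up to the configured cap.
                    Thread.sleep(backoffMs);
                    backoffMs = Math.min(backoffMs * 2, maxBackoffMs);
                }
            }
        }
        // All attempts failed: notify the handler (which could, for example,
        // restart the consumer), then rethrow the last error to the caller.
        handler.onRetriesExhausted(maxAttempts, last);
        throw last;
    }
}
```

A caller would pass the node creation as the action and a restart hook as
the handler. This only bounds the loop; it does not explain why the
ephemeral node is still there.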

Thanks,

Mayuresh

On Sun, Jul 12, 2015 at 8:45 PM, tao xiao <xiaotao...@gmail.com> wrote:

> Restarting the consumers does fix the issue. But since the zk retry is
> wrapped in an infinite loop, the consumer gets no chance to respond to it
> until someone notices and restarts it. The reason I suggest a maximum retry
> policy is that, once max retry is reached, it can invoke a custom handler
> into which I can inject a restart call, so that the consumer can remedy
> itself without anyone's attention.
>
> On Mon, 13 Jul 2015 at 11:36 Jiangjie Qin <j...@linkedin.com.invalid>
> wrote:
>
> > Hi Tao,
> >
> > We see this error from time to time but did not think of it as a big
> > issue. Is there any reason it bothers you so much?
> > I'm not sure whether throwing an exception to the user in this case is
> > good handling or not. What is the user supposed to do in that case other
> > than retry?
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On 7/12/15, 7:16 PM, "tao xiao" <xiaotao...@gmail.com> wrote:
> >
> > >We saw the error again in our cluster. Has anyone had the same issue
> > >before?
> > >
> > >On Fri, 10 Jul 2015 at 13:26 tao xiao <xiaotao...@gmail.com> wrote:
> > >
> > >> Bump the thread. Any help would be appreciated.
> > >>
> > >> On Wed, 8 Jul 2015 at 20:09 tao xiao <xiaotao...@gmail.com> wrote:
> > >>
> > >>> Additional info
> > >>> Kafka version: 0.8.2.1
> > >>> zookeeper: 3.4.6
> > >>>
> > >>> On Wed, 8 Jul 2015 at 20:07 tao xiao <xiaotao...@gmail.com> wrote:
> > >>>
> > >>>> Hi team,
> > >>>>
> > >>>> I have 10 high-level consumers connecting to Kafka, and one of them
> > >>>> kept complaining about a "conflicted ephemeral node" for about 8
> > >>>> hours. The log was filled with the exception below:
> > >>>>
> > >>>> [2015-07-07 14:03:51,615] INFO conflict in /consumers/group/ids/test-1435856975563-9a9fdc6c data: {"version":1,"subscription":{"test.*":1},"pattern":"white_list","timestamp":"1436275631510"} stored data: {"version":1,"subscription":{"test.*":1},"pattern":"white_list","timestamp":"1436275558570"} (kafka.utils.ZkUtils$)
> > >>>> [2015-07-07 14:03:51,616] INFO I wrote this conflicted ephemeral node [{"version":1,"subscription":{"test.*":1},"pattern":"white_list","timestamp":"1436275631510"}] at /consumers/group/ids/test-1435856975563-9a9fdc6c a while back in a different session, hence I will backoff for this node to be deleted by Zookeeper and retry (kafka.utils.ZkUtils$)
> > >>>>
> > >>>> In the meantime, zookeeper reported the exception below for the same
> > >>>> time span:
> > >>>>
> > >>>> 2015-07-07 22:45:09,687 [myid:3] - INFO  [ProcessThread(sid:3 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x44e657ff19c0019 type:create cxid:0x7a26 zxid:0x3015f6e77 txntype:-1 reqpath:n/a Error Path:/consumers/group/ids/test-1435856975563-9a9fdc6c Error:KeeperErrorCode = NodeExists for /consumers/group/ids/test-1435856975563-9a9fdc6c
> > >>>>
> > >>>> In the end, zookeeper timed out the session and the consumers
> > >>>> triggered a rebalance.
> > >>>>
> > >>>> I know that the conflicted ephemeral node warning is there to handle
> > >>>> a zookeeper bug in which session expiration and ephemeral node
> > >>>> deletion are not done atomically. But as the zookeeper log indicates,
> > >>>> zookeeper never got a chance to delete the ephemeral node, which made
> > >>>> me think that the session had not expired at that time. And for some
> > >>>> reason zookeeper fired a session expire event, which subsequently
> > >>>> invoked ZKSessionExpireListener. I was just wondering whether anyone
> > >>>> has encountered a similar issue before and what I can do on the
> > >>>> zookeeper side to prevent it.
> > >>>>
> > >>>> Another problem is that the
> > >>>> createEphemeralPathExpectConflictHandleZKBug call is wrapped in a
> > >>>> while(true) loop which runs forever until the ephemeral node is
> > >>>> created. Would it be better to employ an exponential retry policy
> > >>>> with a max number of retries, so that it has a chance to re-throw the
> > >>>> exception back to the caller and let the caller handle it in
> > >>>> situations like the one above?
> > >>>>
> > >>>>
> >
> >
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125
