Additional info
Kafka version: 0.8.2.1
ZooKeeper version: 3.4.6

On Wed, 8 Jul 2015 at 20:07 tao xiao <xiaotao...@gmail.com> wrote:

> Hi team,
>
> I have 10 high-level consumers connected to Kafka, and one of them kept
> reporting a "conflicted ephemeral node" for about 8 hours. Its log was
> filled with the entries below:
>
> [2015-07-07 14:03:51,615] INFO conflict in
> /consumers/group/ids/test-1435856975563-9a9fdc6c data:
> {"version":1,"subscription":{"test.*":1},"pattern":"white_list","timestamp":"1436275631510"}
> stored data:
> {"version":1,"subscription":{"test.*":1},"pattern":"white_list","timestamp":"1436275558570"}
> (kafka.utils.ZkUtils$)
> [2015-07-07 14:03:51,616] INFO I wrote this conflicted ephemeral node
> [{"version":1,"subscription":{"test.*":1},"pattern":"white_list","timestamp":"1436275631510"}]
> at /consumers/group/ids/test-1435856975563-9a9fdc6c a while back in a
> different session, hence I will backoff for this node to be deleted by
> Zookeeper and retry (kafka.utils.ZkUtils$)
>
> In the meantime, ZooKeeper reported the exception below over the same
> time span:
>
> 2015-07-07 22:45:09,687 [myid:3] - INFO  [ProcessThread(sid:3
> cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException
> when processing sessionid:0x44e657ff19c0019 type:create cxid:0x7a26
> zxid:0x3015f6e77 txntype:-1 reqpath:n/a Error
> Path:/consumers/group/ids/test-1435856975563-9a9fdc6c Error:KeeperErrorCode
> = NodeExists for /consumers/group/ids/test-1435856975563-9a9fdc6c
>
> In the end, ZooKeeper timed out the session and the consumers triggered a
> rebalance.
>
> I know that the conflicted ephemeral node handling exists to work around a
> ZooKeeper bug in which session expiration and ephemeral node deletion are
> not done atomically. But as the ZooKeeper log indicates, ZooKeeper never got
> a chance to delete the ephemeral node, which makes me think that the session
> had not expired at that time. Yet for some reason ZooKeeper fired a session
> expiration event, which subsequently invoked ZKSessionExpireListener. Has
> anyone encountered a similar issue before, and what can I do on the
> ZooKeeper side to prevent it?
>
> Another problem is that the createEphemeralPathExpectConflictHandleZKBug
> call is wrapped in a while(true) loop that runs until the ephemeral node is
> created. Would it be better to use an exponential backoff retry policy with
> a maximum number of retries, so that the exception can be re-thrown to the
> caller and handled there in situations like the one above?
>
>
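
For illustration, a minimal sketch in Java of the bounded exponential backoff asked about above. This is not Kafka code: BoundedRetry, RetriesExhaustedException, backoffMs and the parameters maxAttempts, baseMs and maxMs are all hypothetical names, and the action passed in merely stands in for the ephemeral node creation. The point is only that the loop stops after a fixed number of attempts and hands the last failure back to the caller.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Kafka or ZkUtils.
final class BoundedRetry {

    /** Thrown once every attempt has failed; the last failure is the cause. */
    static final class RetriesExhaustedException extends Exception {
        RetriesExhaustedException(int attempts, Throwable cause) {
            super("gave up after " + attempts + " attempts", cause);
        }
    }

    /** Delay before the next try: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs. */
    static long backoffMs(int attempt, long baseMs, long maxMs) {
        long delay = baseMs;
        for (int i = 0; i < attempt; i++) {
            if (delay >= maxMs / 2) {
                return maxMs; // doubling would reach or pass the cap
            }
            delay *= 2;
        }
        return Math.min(delay, maxMs);
    }

    /** Runs the action up to maxAttempts times, sleeping between failed attempts. */
    static <T> T call(Callable<T> action, int maxAttempts, long baseMs, long maxMs)
            throws RetriesExhaustedException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (InterruptedException e) {
                throw e; // do not swallow interruption
            } catch (Exception e) {
                last = e;
                if (attempt + 1 < maxAttempts) {
                    Thread.sleep(backoffMs(attempt, baseMs, maxMs));
                }
            }
        }
        // Unlike a while(true) loop, the caller now gets to decide what to do.
        throw new RetriesExhaustedException(maxAttempts, last);
    }
}
```

With maxAttempts = 5, baseMs = 100 and maxMs = 5000, such a helper would wait 100, 200, 400 and 800 ms between attempts and then throw. Whether giving up is actually safe for the consumer registration path is a separate question that this sketch does not answer.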
