[jira] [Commented] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14227033#comment-14227033 ] Guozhang Wang commented on KAFKA-992: - We have seen some scenarios that are not fully resolved by this patch: in certain cases the ephemeral node is never deleted, even after the session has expired (there is a ticket, ZOOKEEPER-1208, for this, and it is marked as fixed in 3.3.4, but we are still seeing the issue with a newer version). For this corner case, one thing we can do (or, more precisely, hack around) is to force-delete the ZK path when the difference between the written timestamp and the current timestamp is already larger than the ZK session timeout value. > Double Check on Broker Registration to Avoid False NodeExist Exception > -- > > Key: KAFKA-992 > URL: https://issues.apache.org/jira/browse/KAFKA-992 > Project: Kafka > Issue Type: Bug > Reporter: Neha Narkhede > Assignee: Guozhang Wang > Attachments: KAFKA-992.v1.patch, KAFKA-992.v10.patch, > KAFKA-992.v11.patch, KAFKA-992.v12.patch, KAFKA-992.v13.patch, > KAFKA-992.v14.patch, KAFKA-992.v2.patch, KAFKA-992.v3.patch, > KAFKA-992.v4.patch, KAFKA-992.v5.patch, KAFKA-992.v6.patch, > KAFKA-992.v7.patch, KAFKA-992.v8.patch, KAFKA-992.v9.patch > > > The current behavior of zookeeper for ephemeral nodes is that session > expiration and ephemeral node deletion are not a single atomic operation. > The side-effect of the above zookeeper behavior in Kafka, for certain corner > cases, is that ephemeral nodes can be lost even if the session is not > expired. The sequence of events that can lead to lost ephemeral nodes is as > follows - > 1. The session expires on the client; it assumes the ephemeral nodes are > deleted, so it establishes a new session with zookeeper and tries to > re-create the ephemeral nodes. > 2. However, when it tries to re-create the ephemeral node, zookeeper throws > back a NodeExists error code.
Now this is legitimate during a session > disconnect event (since zkclient automatically retries the > operation and raises a NodeExists error). Also, by design, Kafka server > doesn't have multiple zookeeper clients create the same ephemeral node, so > the Kafka server assumes the NodeExists is normal. > 3. However, after a few seconds zookeeper deletes that ephemeral node. So > from the client's perspective, even though the client has a new valid > session, its ephemeral node is gone. > This behavior is triggered by very long fsync operations on the zookeeper > leader. When the leader wakes up from such a long fsync operation, it has > several sessions to expire, and the time between the session expiration and > the ephemeral node deletion is magnified. Between these two operations, a > zookeeper client can issue an ephemeral node creation operation that could > appear to have succeeded, but the leader later deletes the ephemeral node, > leading to permanent ephemeral node loss from the client's perspective. > Thread from the zookeeper mailing list: > http://zookeeper.markmail.org/search/?q=Zookeeper+3.3.4#query:Zookeeper%203.3.4%20date%3A201307%20+page:1+mid:zma242a2qgp6gxvx+state:results -- This message was sent by Atlassian JIRA (v6.3.4#6332)
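The force-delete workaround Guozhang describes above can be sketched as a pure check: if the timestamp written into the registration znode is older than the ZK session timeout, the node must belong to an expired session whose cleanup ZooKeeper skipped (ZOOKEEPER-1208), so it is safe to delete it and re-register. All names below are hypothetical illustrations, not Kafka's actual API; a real implementation would read the timestamp out of the znode data.

```java
// Hypothetical sketch of the timestamp-based force-delete check from the
// comment above. If the node has outlived any session that could have
// created it, treat it as leaked and force-delete before re-registering.
final class StaleEphemeralCheck {
    /**
     * @param writtenTimestampMs timestamp stored in the znode data at registration
     * @param nowMs              current wall-clock time
     * @param zkSessionTimeoutMs configured ZooKeeper session timeout
     * @return true if the difference already exceeds the session timeout
     */
    static boolean shouldForceDelete(long writtenTimestampMs,
                                     long nowMs,
                                     long zkSessionTimeoutMs) {
        return nowMs - writtenTimestampMs > zkSessionTimeoutMs;
    }
}
```

The caller would then delete the path and retry the ephemeral create; note this relies on roughly synchronized clocks between the writer and the checker.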
[jira] [Commented] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742837#comment-13742837 ] Jun Rao commented on KAFKA-992: --- Thanks for patch v14. Committed to 0.8.
[jira] [Commented] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742385#comment-13742385 ] Jun Rao commented on KAFKA-992: --- Thanks for patch v13. Looks good. My only suggestion is to set leaderId to -1 in ZookeeperLeaderElector.resign() if we want to keep it.
[jira] [Commented] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742356#comment-13742356 ] Guozhang Wang commented on KAFKA-992: - Thanks for the comments, Jun. 114. Agreed; deleted Controller.scala and moved the logic to the KafkaController object. 120. I thought ZookeeperLeaderElector.resign() was a public function that can be called by the parent process of the Elector. Currently ZookeeperLeaderElector depends on KafkaController (it takes controllerContext as one of its parameters), but I think it could be refactored in the future into an independent election module? 121. In this case what really happens is that another broker has been elected as the leader but somehow got "resigned". This will trigger another election round. So instead of logging an error, we would be better off logging it as a warn and setting leaderId to -1?
[jira] [Commented] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742329#comment-13742329 ] Jun Rao commented on KAFKA-992: --- Thanks for patch v12. A few more comments. 114. If that's the intention, we should put the logic in object KafkaController, which already exists. Also, the comment above the class is incorrect. 120. ZookeeperLeaderElector.resign() is no longer being used and can be removed. 121. ZookeeperLeaderElector.elect(): Just to be consistent with case e2, shouldn't we just log an error and set leaderId to -1 in the following case too? case None => throw new KafkaException("Controller doesn't exist")
[jira] [Commented] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742296#comment-13742296 ] Guozhang Wang commented on KAFKA-992: - A few more thoughts about 113: currently leaderId is only read by amILeader, which is only called at the end of elect to determine whether the election succeeded or not. At controller shutdown, it will try to read the controller id from ZK again instead of directly using leaderId. Hence, if we do not want to make this exception fatal and shut down the whole broker, setting leaderId to -1 and logging the error is OK. This conclusion is based on the assumption that if all the brokers failed the election and no one becomes the leader, that is supposed to be a fatal error and should be detected by monitoring.
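The leaderId bookkeeping debated in points 113/120/121 can be sketched as a small state holder: on an unexpected election error the elector logs and resets leaderId to -1 instead of failing the broker, so amILeader simply reports false and the next election round can retry. This is a hypothetical illustration of the behavior under discussion, not Kafka's actual ZookeeperLeaderElector (which also watches the election znode).

```java
// Hypothetical sketch of the leaderId handling discussed above. -1 means
// "no known leader"; errors demote rather than kill the broker.
final class LeaderState {
    static final int NO_LEADER = -1;
    private final int brokerId;
    private int leaderId = NO_LEADER;

    LeaderState(int brokerId) { this.brokerId = brokerId; }

    /** The election znode was created (or read back) successfully. */
    void onElected(int electedId) { leaderId = electedId; }

    /** Unexpected error during elect(): log (omitted) and reset, do not throw. */
    void onElectionError() { leaderId = NO_LEADER; }

    /** This broker actively resigns; the znode deletion itself is omitted here. */
    void resign() { leaderId = NO_LEADER; }

    boolean amILeader() { return leaderId == brokerId; }
}
```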
[jira] [Commented] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741587#comment-13741587 ] Guozhang Wang commented on KAFKA-992: - Thanks for the comments, Neha, Jun. And sorry for the typos. Neha: 1. Done. 2. Done. 3. Done. 4. Done. Jun: 110.1. Done. 110.2. Done. 110.3. Done. 110.4. Done. 110.5. Done. 111. Done. 112. Done. 113. Going by the meaning of "resign", which indicates a valid leader actively resigning its role as the leader, deleting its election path is the correct way of resigning. The question here is whether, upon receiving a non-ZkNodeExistsException, we should really call resign or not. I am proposing not, and instead just logging the error and setting leaderId = -1. 114. Removed unused imports. As for renaming, my expectation is that the Controller object might be extended, just as the Broker object in Broker.scala, to create a Controller class instance when more fields are added to Controller besides just the ID. So I would propose keeping its name and its location in kafka.cluster for now.
[jira] [Commented] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741206#comment-13741206 ] Jun Rao commented on KAFKA-992: --- Thanks for patch v11. Much better, but it can still be improved. Some more comments: 110. ZkUtils.createEphemeralPathExpectConflictHandleZKBug: 110.1 Could we list the ZK jira related to this bug? If one doesn't exist, create a new one. That way, we can track when the bug is fixed. 110.2 Typos in the comment: ata and NodeExistEception 110.3 Is it better to change caller to expectedCallerData? 110.4 Could you explain what checker() does in the comment? 110.5 It would be useful to log the ZK path in addition to the value. 111. ZkUtils.registerBrokerInZk: Is it better to rename selfBroker to expectedBroker? 112. KafkaController: typo zkSessionTimout 113. ZookeeperLeaderElector.elect(): I am a bit confused about what resign() should do. It seems that it needs to either reset leaderId or throw an exception to the caller. 114. Controller: I think it's better to rename it to ControllerUtils. Also, remove unused imports.
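The createEphemeralPathExpectConflictHandleZKBug flow under review can be sketched against a minimal stand-in store: on NodeExists, read the node back and use checker() to decide whether the stored data matches what this caller expects to have written. A match means the node is our own leftover from the expired session that ZooKeeper has not yet deleted, so we back off and retry; a mismatch is a genuine conflict. All names and the in-memory store below are hypothetical, not Kafka's ZkUtils.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiPredicate;

// Hypothetical sketch of the double-check on broker registration. A HashMap
// stands in for ZooKeeper; the backoff hook stands in for sleeping out the
// delayed ephemeral-node deletion described in the issue.
final class EphemeralRegistrar {
    private final Map<String, String> store = new HashMap<>();

    /** Simulates ZooKeeper eventually deleting an expired session's node. */
    void delete(String path) { store.remove(path); }

    String read(String path) { return store.get(path); }

    void createEphemeralExpectConflict(String path, String data,
                                       String expectedCallerData,
                                       BiPredicate<String, String> checker,
                                       Runnable backoff, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (!store.containsKey(path)) {     // create() succeeded
                store.put(path, data);
                return;
            }
            String existing = store.get(path);  // NodeExists: read the node back
            if (!checker.test(existing, expectedCallerData))
                throw new IllegalStateException("Conflict at " + path + ": " + existing);
            backoff.run();  // our own stale node: wait out the delayed deletion, retry
        }
        throw new IllegalStateException("Gave up creating " + path);
    }
}
```

Logging the ZK path alongside the value (Jun's 110.5) would go on both the conflict and give-up branches.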
[jira] [Commented] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741179#comment-13741179 ] Neha Narkhede commented on KAFKA-992: - Overall, v11 is a good refactor. A few minor formatting comments - 1. Broker: I think getZkString can be removed. This is a nice-to-have cleanup item, not introduced in your patch. 2. ZkUtils 2.1 Can we break the long log line that says "A broker is already registered..."? 2.2 Typo in the comments above createEphemeralPathExpectConflictHandleZKBug() => NodeExistEception 3. KafkaController: Typo => zkSessionTimout 4. Controller: Can we add back the following statement in warn? It was helpful for me to know this while testing a cluster upgrade with this patch - warn("Failed to parse the controller info as json. " + "Probably this controller is still using the old format [%s] of storing the broker id in the zookeeper path".format(controller))
[jira] [Commented] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739857#comment-13739857 ] Jun Rao commented on KAFKA-992: --- v10 doesn't seem to apply to current 0.8.
When the leader wakes up from such a long fsync operation, it has > several sessions to expire. And the time between the session expiration and > the ephemeral node deletion is magnified. Between these 2 operations, a > zookeeper client can issue a ephemeral node creation operation, that could've > appeared to have succeeded, but the leader later deletes the ephemeral node > leading to permanent ephemeral node loss from the client's perspective. > Thread from zookeeper mailing list: > http://zookeeper.markmail.org/search/?q=Zookeeper+3.3.4#query:Zookeeper%203.3.4%20date%3A201307%20+page:1+mid:zma242a2qgp6gxvx+state:results -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
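The failure sequence above suggests the shape of the fix discussed in this ticket: on a NodeExists during re-registration, read the node back and only treat the conflict as fatal if the data was written by someone else; our own stale data means ZK has not yet purged the node from our expired session, so we wait and retry. A minimal self-contained sketch in Python — the `FakeZk` class and every name here are illustrative stand-ins, not Kafka's or ZooKeeper's actual API:

```python
import time

class NodeExistsError(Exception):
    """Stand-in for the ZK client's node-exists exception."""

class FakeZk:
    """Minimal in-memory stand-in for a ZooKeeper client, for illustration."""
    def __init__(self):
        self.nodes = {}

    def create_ephemeral(self, path, data):
        if path in self.nodes:
            raise NodeExistsError(path)
        self.nodes[path] = data

    def read(self, path):
        return self.nodes.get(path)

    def delete(self, path):
        self.nodes.pop(path, None)

def register_broker(zk, path, data, retries=5, backoff_s=0.1):
    """Double-check on NodeExists: if the existing node holds someone else's
    data it is a genuine conflict; if it holds our own data it is a stale
    node from our expired session, so wait for ZK to purge it and retry."""
    for _ in range(retries):
        try:
            zk.create_ephemeral(path, data)
            return True
        except NodeExistsError:
            existing = zk.read(path)
            if existing is not None and existing != data:
                raise RuntimeError("path owned by another broker: %r" % existing)
            time.sleep(backoff_s)  # give ZK time to delete the stale node
    return False
```

The "is this node mine" test is the part that varies per ZK path, which the later comments in this thread discuss generalizing.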
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13737228#comment-13737228 ] Neha Narkhede commented on KAFKA-992:

+1 on 80. That's a great suggestion, Jun!
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13736988#comment-13736988 ] Jun Rao commented on KAFKA-992:

Thanks for patch v8. I think the code can still be made cleaner.

80. If you look at the code in ZkUtils.registerBrokerInZk(), ZookeeperConsumerConnector.registerConsumerInZK() and ZookeeperLeaderElector.elect(), they all have the logic for handling the ZK bug. They differ only slightly, because the way they check whether the registration is from the same client is different. I was thinking that we can write a new util function called something like createEphemeralPathExpectConflictHandleZKBug(). This function will take a function that checks whether the value in a ZK path is from the caller. It will then keep trying to create the path until either it detects a value put in by a different caller or the creation succeeds. We get several benefits from this: (1) there is a centralized place to handle the ZK bug, so we avoid code duplication; (2) it separates the logic of handling the ZK bug from the rest of the logic in the caller, which makes the latter easier to understand; (3) it makes it easier to remove the logic in the future when the ZK bug is fixed.

81. In ZookeeperLeaderElector.elect(), we also have the logic to handle different formats of the value of the controller path. It seems that can probably be simplified a bit too. Basically, if we read the old format (in the new code), we can treat it as if someone else already did the registration.

82. There is code duplication in ZkUtils.getController() and ZookeeperLeaderElector.LeaderChangeListener.handleDataChange(). Could we share the logic in a separate util?
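The util proposed in point 80 can be sketched as follows. This is an illustrative Python rendering, not Kafka's Scala code; the `create`/`read` callbacks are hypothetical stand-ins for the ZK client calls, and `is_mine` is the caller-supplied ownership check that point 80 says is the only part that differs between broker, controller and consumer:

```python
import time

class NodeExistsError(Exception):
    """Stand-in for the ZK client's node-exists exception."""

def create_ephemeral_expect_conflict_handle_zk_bug(create, read, is_mine,
                                                   backoff_ms=100, max_tries=10):
    # Keep trying to create the path until either the creation succeeds or
    # the value on the path turns out to have been written by a different
    # caller.
    for _ in range(max_tries):
        try:
            create()
            return  # creation succeeded
        except NodeExistsError:
            data = read()
            if data is not None and not is_mine(data):
                # a different caller owns the path: a genuine conflict
                raise RuntimeError("path already written by another caller")
            # either the node vanished between create() and read(), or it is
            # our own stale node from the expired session: wait for ZK to
            # delete it, then retry
            time.sleep(backoff_ms / 1000.0)
    raise RuntimeError("stale ephemeral node was never deleted")
```

Centralizing the loop this way leaves each caller responsible only for its own `is_mine` predicate, which is exactly the separation point 80 argues for.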
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735437#comment-13735437 ] Neha Narkhede commented on KAFKA-992:

+1 on v8. Good catch!
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734999#comment-13734999 ] Neha Narkhede commented on KAFKA-992:

I agree with Guozhang that the logic to ensure we get around the de-registration issue is very nuanced to the specific path and the semantics of that path. +1 on the latest patch.
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734994#comment-13734994 ] Guozhang Wang commented on KAFKA-992:

Thanks for the comments, Jun. I think the re-registering logic is slightly different for broker, controller and consumer:

Broker: need to check hostname + port
Controller: only need to check brokerId
Consumer: need not check anything, since the consumer info like hostname and port is encoded in the ZkPath.

So I think it is hard to unify the consumer's logic with the broker's and controller's; it is possible to unify the broker's and controller's logic, though, by passing the list of json fields that we need to check. But I am not sure it is worth the effort.
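The three per-path ownership checks Guozhang lists can be sketched as small predicates. The json field names below ("host", "port", "brokerid") are assumptions for illustration, not necessarily Kafka's actual registration formats:

```python
import json

def broker_is_mine(data, host, port):
    # Broker registration: the node is ours only if hostname AND port match.
    d = json.loads(data)
    return d["host"] == host and d["port"] == port

def controller_is_mine(data, broker_id):
    # Controller path: the broker id alone identifies the writer.
    return json.loads(data)["brokerid"] == broker_id

# Consumer: its identity (e.g. hostname and port) is encoded in the ZK path
# itself, so a NodeExists on that path can only be the consumer's own stale
# node -- no data check is needed at all.
```

Passing a list of json fields to compare, as the comment suggests, would collapse the first two predicates into one, at the cost of a slightly more abstract util.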
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734903#comment-13734903 ] Jun Rao commented on KAFKA-992:

Thanks for the patch. It doesn't seem to apply for me. Do you need to rebase? Just one quick comment: it seems there is common code in the re-registering logic of broker, controller and consumer. Instead of duplicating the code, could we create a common util to share the code?
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734114#comment-13734114 ] Neha Narkhede commented on KAFKA-992:

Thanks for the follow-up patch, Guozhang. Overall, looks correct. A few minor suggestions:

9. ZkUtils
9.1 Could you add more details in the log message when the json parsing of the controller path fails? Since we know we are changing the format, something along the lines of "Json parsing of the controller path failed. Probably this controller is still using the old format [%s] of storing the broker id in the zookeeper path".
9.2 We don't need to convert the controller variable to string, since it is already a string.
9.3 Improve the error message when both the json parsing and the toInt conversion fail. "Failed to parse the leader leaderinfo" doesn't say that we failed to parse the controller's leader election path.

10. ZookeeperLeaderElector
10.1 Remove the unused import BrokerNotAvailableException.
10.2 In the elect() API, shouldn't we use readDataMaybeNull instead of readData? That covers the case where the ephemeral node disappears before you get a chance to read it.
10.3 Since the changes to elect() are new, I suggest we convert the debug statements to info or warn. elect() is rarely called, so this will not pollute the log.
10.4 One suggestion to reduce code and make it somewhat cleaner: if we change electFinished to electionNotDone, we need to change it in only one place, where we don't need to retry. Currently we have to change electFinished multiple times at different places.
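The loop shape suggested in 10.4 can be sketched as follows; this is an illustrative Python sketch, not Kafka's elect() implementation, and `create_path`/`read_leader` are hypothetical stand-ins for the ZK calls:

```python
class NodeExistsError(Exception):
    """Stand-in for the ZK client's node-exists exception."""

def elect(create_path, read_leader, my_id, max_tries=10):
    # Single electionNotDone-style flag (10.4): flipped in exactly one
    # place -- once we have learned who the leader is and no retry is needed.
    election_not_done = True
    leader_id = None
    tries = 0
    while election_not_done and tries < max_tries:
        tries += 1
        try:
            create_path(my_id)
            leader_id = my_id      # our write won the election
        except NodeExistsError:
            # readDataMaybeNull-style read (10.2): the ephemeral node may
            # disappear before we get a chance to read it
            leader_id = read_leader()
        if leader_id is not None:
            election_not_done = False  # the only place the flag changes
        # else: the node vanished between create and read -- retry
    return leader_id == my_id
```

With the flag updated in a single spot, every other path through the loop naturally falls through to a retry, which is the readability win 10.4 is after.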
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732132#comment-13732132 ] Neha Narkhede commented on KAFKA-992:

Thanks for the follow-up patch. The changes to the consumer look good. I have a few concerns about the changes to the controller:

1. ZookeeperLeaderElector
1.1 This change is backwards incompatible. Unfortunately, when we versioned the zookeeper data, we left out the controller path. So we have to handle both the previous format and the new format in the code until the old format can be phased out. This will be hacky, but we cannot accept the change as is, since that would require downtime at release.
1.2 We have moved to using json for zookeeper data. It will be good if we can follow that while making this change to the controller path.
1.3 The while loop has a lot of return statements. How about refactoring it to have while(!writeSucceeded) {} and keeping the return amILeader at the very end?
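Point 1.1 amounts to a two-format parse of the controller path value: try the new json encoding first and fall back to the legacy bare broker id. A sketch, with the "brokerid" field name assumed for illustration:

```python
import json

def parse_controller_id(data):
    # New format: a json object carrying the broker id. The "brokerid"
    # field name is an assumption for this sketch.
    try:
        return int(json.loads(data)["brokerid"])
    except (ValueError, TypeError, KeyError):
        pass
    # Old format: the bare broker id as a plain integer string.
    try:
        return int(data)
    except ValueError:
        raise ValueError(
            "failed to parse the controller path value %r in either the "
            "old or the new format" % (data,))
```

Keeping the fallback until the old format is phased out avoids the release-time downtime the comment warns about, and the final error message names the controller path explicitly, per point 9.3 elsewhere in this thread.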
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731353#comment-13731353 ] Guozhang Wang commented on KAFKA-992:

The zookeeper bug can be reproduced as follows:

1. Check out a clean 0.8 branch and revert the KAFKA-992 fix.
2. Build and create a server connecting to a Zookeeper instance (make sure maxClientCnxns = 0 in the ZK config so that one IP address can create as many connections as wanted).
3. Load the Zookeeper with dummy sessions, each of which creates and maintains a thousand ephemeral nodes.
4. Write a script that pauses and resumes the Zookeeper process continuously, for example:
---
while true
do
    kill -STOP $1
    sleep 8
    kill -CONT $1
    sleep 60
done
---
5. When the Zookeeper process resumes, it will mark all sessions as timed out, but since there are so many ephemeral nodes to delete, the server's registration node may not be deleted yet when the server tries to re-register itself, so the server thinks it has registered successfully.
6. Later, Zookeeper will delete the server's registration node without the server's awareness.
7. If we re-apply KAFKA-992's patch and redo the same testing setup, then under similar conditions the server will wait and retry.
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731352#comment-13731352 ] Neha Narkhede commented on KAFKA-992: - We just found a way to reliably reproduce the zookeeper bug and verify that the KAFKA-992 fix works. Now we can fix the controller and consumer the same way.
> Double Check on Broker Registration to Avoid False NodeExist Exception
> Key: KAFKA-992
> URL: https://issues.apache.org/jira/browse/KAFKA-992
> Project: Kafka
> Issue Type: Bug
> Reporter: Neha Narkhede
> Assignee: Guozhang Wang
> Attachments: KAFKA-992.v1.patch, KAFKA-992.v2.patch, KAFKA-992.v3.patch, KAFKA-992.v4.patch
>
> The current behavior of zookeeper for ephemeral nodes is that session expiration and ephemeral node deletion are not an atomic operation.
> The side effect of this zookeeper behavior in Kafka, for certain corner cases, is that ephemeral nodes can be lost even if the session has not expired. The sequence of events that can lead to lost ephemeral nodes is as follows:
> 1. The session expires on the client; it assumes the ephemeral nodes are deleted, so it establishes a new session with zookeeper and tries to re-create the ephemeral nodes.
> 2. However, when it tries to re-create the ephemeral node, zookeeper throws back a NodeExists error code. Now this is legitimate during a session disconnect event (since zkclient automatically retries the operation and raises a NodeExists error). Also, by design, the Kafka server doesn't have multiple zookeeper clients create the same ephemeral node, so the Kafka server assumes the NodeExists is normal.
> 3. However, after a few seconds zookeeper deletes that ephemeral node. So from the client's perspective, even though the client has a new valid session, its ephemeral node is gone.
> This behavior is triggered by very long fsync operations on the zookeeper leader. When the leader wakes up from such a long fsync operation, it has several sessions to expire, and the time between the session expiration and the ephemeral node deletion is magnified. Between these two operations, a zookeeper client can issue an ephemeral node creation operation that appears to have succeeded, but the leader later deletes the ephemeral node, leading to permanent ephemeral node loss from the client's perspective.
> Thread from zookeeper mailing list: http://zookeeper.markmail.org/search/?q=Zookeeper+3.3.4#query:Zookeeper%203.3.4%20date%3A201307%20+page:1+mid:zma242a2qgp6gxvx+state:results
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
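The double check discussed in this thread guards against exactly that race. A minimal illustrative sketch, assuming a hypothetical client API (`FakeZk`, `register_broker`, and `NodeExistsError` are invented for illustration and are not Kafka's actual code): on NodeExists, compare the existing node's host/port against our own; if they match, the node is a stale leftover from our expired session, so wait for ZooKeeper to finish deleting it and retry.

```python
import json
import time

class NodeExistsError(Exception):
    pass

class FakeZk:
    """Tiny in-memory stand-in for a ZooKeeper client (illustrative only)."""
    def __init__(self):
        self.nodes = {}

    def create_ephemeral(self, path, data):
        if path in self.nodes:
            raise NodeExistsError(path)
        self.nodes[path] = data

    def read(self, path):
        return self.nodes[path]

def register_broker(zk, path, host, port, session_timeout_ms=6000, backoff_ms=50):
    """Hypothetical sketch of the KAFKA-992 double check: on NodeExists,
    decide whether the existing node is our own stale registration (same
    host/port) left over from an expired session, and if so wait for
    ZooKeeper to delete it before retrying the create."""
    payload = json.dumps({"host": host, "port": port,
                          "timestamp": int(time.time() * 1000)})
    deadline = time.time() + 2 * session_timeout_ms / 1000.0
    while time.time() < deadline:
        try:
            zk.create_ephemeral(path, payload)
            return True  # registration succeeded
        except NodeExistsError:
            existing = json.loads(zk.read(path))
            if (existing.get("host"), existing.get("port")) != (host, port):
                # A different broker genuinely owns this id: real conflict.
                raise RuntimeError("broker id registered by another broker")
            # Our own stale node: give ZK time to delete it, then retry.
            time.sleep(backoff_ms / 1000.0)
    raise RuntimeError("stale ephemeral node was never deleted")
```

A fresh path registers immediately, while a node owned by a different host/port is surfaced as a genuine conflict instead of being silently accepted.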
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731337#comment-13731337 ] Joel Koshy commented on KAFKA-992: -- And nm my comments about controller/consumers as well. For consumers, we don't regenerate the consumer id string. For the controller, what can end up happening is:
- the controller's session expires and it becomes the controller again (with the stale ephemeral node)
- another broker (whose session may not have expired) receives a watch when the stale ephemeral node is actually deleted
- so we can end up with two controllers in this scenario.
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731282#comment-13731282 ] Joel Koshy commented on KAFKA-992: -- ok nm the comment about timestamp. I had forgotten that NodeExists wouldn't be thrown if the data is the same.
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731187#comment-13731187 ] Joel Koshy commented on KAFKA-992: -- Delayed review - looks good to me, although I still don't see a benefit in storing the timestamp: the approach of retrying on NodeExists when the host and port are the same would remain the same, so the timestamp seems more for informative purposes. Let me know if I'm missing something. @Jun, you have a point about the controller. It may not be a problem there, since controller re-election will happen only after the data is actually deleted. For consumers it may not be an issue either, given that the consumer id string includes a random uuid.
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13730916#comment-13730916 ] Jun Rao commented on KAFKA-992: --- Thinking about this more. The same ZK issue can affect the controller and the consumer re-registration too. Should those be handled too?
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729652#comment-13729652 ] Neha Narkhede commented on KAFKA-992: - +1 on v4
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729607#comment-13729607 ] Swapnil Ghike commented on KAFKA-992: - +1
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729569#comment-13729569 ] Jun Rao commented on KAFKA-992: --- It seems that it's going to take some time before this issue is fixed in ZK. So, I suggest that we patch this in 0.8.
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13728905#comment-13728905 ] Neha Narkhede commented on KAFKA-992: - Thanks for patch v3. A few more review comments:
6. We should get the session timeout from KafkaConfig instead of hardcoding it.
7. It seems like the return should actually be moved inside the try block. That is the only time we don't want to retry, since the operation was successful.
8. You are right about createEphemeralPathExpectConflict. It already handles 3.1 (in my comments above).
This bug is very serious and can halt correct operation of a 0.8 cluster. In a typical production deployment of Kafka, where many consumers write offsets to the same zookeeper cluster that the 0.8 cluster is connected to, there is a higher risk of hitting this bug. On the other hand, you can always increase the session timeout enough to get around it. However, in that case, if a broker crashes or has to be killed, it takes as long as the session timeout for the consumers to recover. We have hit this bug in production at LinkedIn several times and have also had to kill 0.8 brokers due to bugs in controlled shutdown (KAFKA-999). I understand that we want to stop taking patches on 0.8. We are still on the 0.8 beta in open source. Until trunk is ready to be released, companies that have Kafka 0.8-beta running in production can run into blocker bugs (KAFKA-992, KAFKA-999). What release pattern can we follow here? Does it make sense to only take critical fixes on 0.8 and leave other changes to trunk? That would allow critical bug fixes to go to production before 0.8.1 is ready for release.
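Point 7 in the review above (move the return inside the try block, because success is the only case where we don't want to retry) can be sketched as follows. This is a hypothetical helper, not the actual patch; `create_with_retry` and its parameters are invented for illustration:

```python
import time

def create_with_retry(create_fn, retries=3, backoff_s=0.01):
    """Retry wrapper: the return sits inside the try, so the loop exits
    early only when create_fn succeeds; any failure falls through to the
    sleep and the next attempt."""
    last_err = None
    for _ in range(retries):
        try:
            create_fn()
            return True  # success is the only early exit
        except Exception as e:  # e.g. NodeExists while a stale node lingers
            last_err = e
        time.sleep(backoff_s)
    raise last_err
```

Placing the return outside the try (or sleeping before checking success) would make the first iteration exit unconditionally, which is the bug being pointed out.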
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13728583#comment-13728583 ] Jay Kreps commented on KAFKA-992: - This is good, this can go on trunk, right?
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13728358#comment-13728358 ] Swapnil Ghike commented on KAFKA-992: - Also, the while loop should be fixed: the first sleep will lead to a return. I would also use break rather than return.
[jira] [Commented] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13728354#comment-13728354 ] Swapnil Ghike commented on KAFKA-992:

Makes sense. Just one comment: you can use the session timeout from KafkaConfig; it will give you the value that is actually being used at runtime.

> Double Check on Broker Registration to Avoid False NodeExist Exception
> --
>
> Key: KAFKA-992
> URL: https://issues.apache.org/jira/browse/KAFKA-992
> Project: Kafka
> Issue Type: Bug
> Reporter: Neha Narkhede
> Assignee: Guozhang Wang
> Attachments: KAFKA-992.v1.patch, KAFKA-992.v2.patch, KAFKA-992.v3.patch
>
> The current behavior of zookeeper for ephemeral nodes is that session expiration and ephemeral node deletion is not an atomic operation.
> The side-effect of the above zookeeper behavior in Kafka, for certain corner cases, is that ephemeral nodes can be lost even if the session is not expired. The sequence of events that can lead to lost ephemeral nodes is as follows:
> 1. The session expires on the client. It assumes the ephemeral nodes are deleted, so it establishes a new session with zookeeper and tries to re-create the ephemeral nodes.
> 2. However, when it tries to re-create the ephemeral node, zookeeper throws back a NodeExists error code. Now this is legitimate during a session disconnect event (since zkclient automatically retries the operation and raises a NodeExists error). Also by design, the Kafka server doesn't have multiple zookeeper clients create the same ephemeral node, so the Kafka server assumes the NodeExists is normal.
> 3. However, after a few seconds zookeeper deletes that ephemeral node. So from the client's perspective, even though the client has a new valid session, its ephemeral node is gone.
> This behavior is triggered by very long fsync operations on the zookeeper leader. When the leader wakes up from such a long fsync operation, it has several sessions to expire, and the time between the session expiration and the ephemeral node deletion is magnified. Between these two operations, a zookeeper client can issue an ephemeral node creation operation that appears to have succeeded, but the leader later deletes the ephemeral node, leading to permanent ephemeral node loss from the client's perspective.
> Thread from zookeeper mailing list: http://zookeeper.markmail.org/search/?q=Zookeeper+3.3.4#query:Zookeeper%203.3.4%20date%3A201307%20+page:1+mid:zma242a2qgp6gxvx+state:results

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
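The non-atomic expiration described above can be reproduced with a toy in-memory model. All names below are hypothetical; this is a sketch of the race, not of ZooKeeper's actual implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of an ensemble where session expiration and ephemeral-node
// deletion are two separate steps, reproducing the NodeExists race.
class ToyZk {
    // path -> owning session id
    final Map<String, Long> ephemeralOwner = new HashMap<>();

    // Step 1 of expiration: the session is marked dead...
    void expireSession(long sessionId) { /* session bookkeeping only */ }

    // ...but step 2, deleting its ephemeral nodes, may lag behind.
    void reapEphemeralNodes(long sessionId) {
        ephemeralOwner.values().removeIf(owner -> owner == sessionId);
    }

    // Returns false (i.e. NodeExists) if the path is still held by any session.
    boolean createEphemeral(String path, long sessionId) {
        return ephemeralOwner.putIfAbsent(path, sessionId) == null;
    }
}

public class NodeExistsRace {
    public static void main(String[] args) {
        ToyZk zk = new ToyZk();
        long oldSession = 1L, newSession = 2L;
        zk.createEphemeral("/brokers/ids/0", oldSession);

        zk.expireSession(oldSession);      // session declared expired
        // The broker reconnects and retries before the reaper runs:
        boolean created = zk.createEphemeral("/brokers/ids/0", newSession);
        System.out.println(created ? "created" : "NodeExists");   // NodeExists

        zk.reapEphemeralNodes(oldSession); // the node is deleted only now, so the
        // broker's apparently-live registration silently disappears:
        System.out.println(zk.ephemeralOwner.containsKey("/brokers/ids/0")); // false
    }
}
```

The window between `expireSession` and `reapEphemeralNodes` is exactly the gap that long fsync pauses on the leader magnify.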
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13728230#comment-13728230 ] Guozhang Wang commented on KAFKA-992:

Swapnil, we also considered this option. The problem is that zkClient does not expose this information, hence we came up with the timestamp approach.
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13728221#comment-13728221 ] Neha Narkhede commented on KAFKA-992:

Swapnil,
- You are right in observing that zookeeper stores the session id as part of the znode. However, when a session is established, we don't have access to the session id through ZkClient. So even though session id comparison is the best way to fix the bug, we can't do that.
- There are a lot of things that will go wrong if zookeeper is not able to create or expire ephemeral nodes. In such cases, the Kafka server will back off and retry registering, and the controller will trigger leader elections repeatedly. So we will know this through the LeaderElectionRate and UnderReplicatedPartitionCount metrics.
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13728218#comment-13728218 ] Neha Narkhede commented on KAFKA-992:

Thanks for patch v2, Guozhang. A few review suggestions:

1. How about keeping the unix timestamp as is? All we have to make sure is that it is equal to what was written; I'm not sure there is an advantage to converting it to some date format.

2. Typo => ephermeral

3. The following log statement is not completely correct:
info("I wrote this conflicted ephermeral node a while back in a different session, " + "hence I will backoff for this node to be deleted by Zookeeper after session timeout and retry")
The reason is that there are 3 cases in which the broker might get NodeExists and the ephemeral node will have the same host and port:
3.1 It ran into one of the recoverable zookeeper errors while creating the ephemeral node, in which case ZkClient retried the operation under the covers and got a NodeExists error on the 2nd retry. In this case, the timestamp will be useful, as it will match what was written and we do not need to retry.
3.2 It hit the zookeeper non-atomic session expiration problem. In this case, the timestamp will not match and we just have to retry.
3.3 The server was killed and restarted within the session timeout. In this case, it is useful to back off for the session timeout and retry ephemeral node creation.
It will be useful from a logging perspective if we can distinguish between case 3.1 and cases 3.2/3.3 and retry accordingly. Another way to look at this is to not store the timestamp and just retry on any NodeExists, as that has to go through at some point, but then we will not get meaningful logging, which is not ideal.

4. Regarding the backoff time, I think it is better to back off for the session timeout.

5. Regarding the case where the broker host and port do not match:
throw new RuntimeException("A broker is already registered on the path " + brokerIdPath + ". This probably indicates that you either have configured a brokerid that is already in use, or " + "else you have shutdown this broker and restarted it faster than the zookeeper " + "timeout so it appears to be re-registering.")
The "else" part of this message is incorrect, since if you shut down and restarted the same broker, the broker host and port should in fact match. We should fix the exception message to state that another broker [host, port] is registered under that id.
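The same-host-and-port cases above, plus the different-broker case from point 5, can be sketched as a pure classification function over the data read back from the conflicting znode. The enum and all names are hypothetical illustrations, not the patch's API:

```java
// Classifies a NodeExists error during broker registration by comparing
// what we tried to write with what the existing znode actually holds.
enum NodeExistsCause {
    RETRIED_WRITE,           // 3.1: zkclient retried; our write already succeeded
    STALE_PREVIOUS_SESSION,  // 3.2/3.3: back off for the session timeout and retry
    OTHER_BROKER             // point 5: another broker uses this id; fatal
}

public class RegistrationConflict {
    static NodeExistsCause classify(String myHost, int myPort, long myTimestamp,
                                    String znodeHost, int znodePort, long znodeTimestamp) {
        if (!znodeHost.equals(myHost) || znodePort != myPort)
            return NodeExistsCause.OTHER_BROKER;
        if (znodeTimestamp == myTimestamp)
            return NodeExistsCause.RETRIED_WRITE;
        return NodeExistsCause.STALE_PREVIOUS_SESSION;
    }
}
```

Note that cases 3.2 and 3.3 are indistinguishable from the znode contents alone (same host:port, different timestamp), which matches the review's "3.1 & 3.2/3" split: only the first case can skip the retry.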
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13727438#comment-13727438 ] Swapnil Ghike commented on KAFKA-992:

- I am not completely clear on why the timestamp is required to be stored in zookeeper along with the other broker info. If I am not wrong, ephemeralOwner = 0x13ff5a4758c4a05 is the session id. Is there a way to get it from zookeeper when we read the broker znode info?
- Perhaps we should have a fixed number of retries. If zookeeper cannot delete the znode within a sufficient amount of time after session expiration, we would probably like to know that we are dealing with a buggy zookeeper setup. Then this should suffice:

catch {
  case e: ZkNodeExistsException =>
    for (i <- 0 until numRetries) {
      if (broker.host == host && broker.port == port && sessionId == lastSessionId)
        Thread.sleep(..)
      else
        throw new RuntimeException(...)
    }
}

> Double Check on Broker Registration to Avoid False NodeExist Exception
> --
>
> Key: KAFKA-992
> URL: https://issues.apache.org/jira/browse/KAFKA-992
> Project: Kafka
> Issue Type: Bug
> Reporter: Guozhang Wang
> Assignee: Guozhang Wang
> Attachments: KAFKA-992.v1.patch, KAFKA-992.v2.patch
>
> There is a potential bug in Zookeeper: when the ZK leader processes a lot of session expiration events (this could be due to a long GC or an fsync operation, etc.), it marks the session as expired but does not delete the corresponding ephemeral znode at the same time.
> Meanwhile, a new session event will be fired on the kafka server, and the server will request the same ephemeral node to be created on handling the new session. When it enters the zookeeper processing queue, this operation receives a NodeExists error, since the zookeeper leader has not finished deleting that ephemeral znode and still thinks the previous session holds it. Kafka assumes that the NodeExists error on ephemeral node creation is ok, since that is a legitimate condition that happens during session disconnects on zookeeper. However, a NodeExists error is only valid if the owner session id also matches the Kafka server's current zookeeper session id. The bug is that before sending a NodeExists error, Zookeeper should check if the ephemeral node in question is held by a session that has been marked as expired.
[ https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724789#comment-13724789 ] Guozhang Wang commented on KAFKA-992:

We can differentiate this edge case from a transient connection loss by adding a timestamp to the broker's ZK string, so that the conflict will be reflected. Then we can check if the host:port are the same. If so, we can treat this ephemeral node as written by the broker itself but from a previous session, hence back off for it to be deleted on session timeout and retry creating the ephemeral node. This makes the transient connection loss a false positive case, but that should be fine since this case happens rarely.
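The backoff-and-retry registration described above, combined with Swapnil's bounded-retry suggestion, can be sketched as follows. ZkStore and every name below are hypothetical stand-ins for the real ZkClient calls, and the sketch assumes the registration data (host:port plus timestamp) is enough to recognize our own earlier write:

```java
// Hypothetical stand-in for the ZkClient operations used during registration.
interface ZkStore {
    boolean createEphemeral(String path, String data); // false on NodeExists
    String read(String path);                          // data at path, or null
}

public class BrokerRegistrar {
    static void register(ZkStore zk, String path, String myData,
                         long sessionTimeoutMs, int maxRetries) throws InterruptedException {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (zk.createEphemeral(path, myData)) return;  // clean create
            if (myData.equals(zk.read(path))) return;      // retried write already landed
            // Otherwise this is likely our stale node from a previous session:
            // wait out the session timeout so ZK can delete it, then retry.
            // (A real implementation would fail fast here when the host:port
            // in the znode differ from ours, per the review above.)
            Thread.sleep(sessionTimeoutMs);
        }
        // Bounded retries: a node that never disappears points at a buggy
        // zookeeper setup rather than the normal expiration lag.
        throw new RuntimeException("Znode " + path + " still exists after "
                + maxRetries + " retries; giving up");
    }
}
```

Backing off for the full session timeout (rather than a shorter fixed interval) matches the review's point 4: that is the longest the stale node can legitimately survive.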