[jira] [Updated] (KAFKA-2572) zk connection instability, perhaps precipitated by zk client timeout during rebalance
[ https://issues.apache.org/jira/browse/KAFKA-2572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Firth updated KAFKA-2572:
------------------------------
    Attachment: 091115-full.log.zip

> zk connection instability, perhaps precipitated by zk client timeout during rebalance
> -------------------------------------------------------------------------------------
>
>                 Key: KAFKA-2572
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2572
>             Project: Kafka
>          Issue Type: Bug
>          Components: zkclient
>    Affects Versions: 0.8.2.1
>         Environment: zk version 3.4.6, CentOS 6, 2.6.32-504.1.3.el6.x86_64
>            Reporter: John Firth
>         Attachments: 090815-digest.log, 090815-full.log, 091115-digest.log, 091115-full.log.zip
>
> On two occasions, have seen zk session expiry, followed by a timeout during
> a consumer rebalance following this expiry, followed by multiple successive
> zk session expiries. Restarting the process using the zk client resolved the
> problems.
> Comparing these with a case in which a new stable zk session was created
> following a session expiry, the timeout during rebalance is not seen in the
> successful case.
> This behavior was seen on 09/08 and 09/11. The attached 'full' logs show all
> log entries minus entries particular to our application. For 09/08, the time
> span is 2015-09-08T12:52:06.069-04:00 to 2015-09-08T13:14:48.250-04:00; for
> 09/11, the time span is 2015-09-11T01:38:17.000-04:00 to
> 2015-09-11T07:44:47.124-04:00. The digest logs are the result of retaining
> only error and warning entries, and entries containing any of: "begin
> rebalancing", "end rebalancing", "timed", and "zookeeper state". For the
> 09/11 digest logs, entries from the kafka.network.Processor logger are also
> excised for clarity. Unfortunately, debug logging was not enabled during
> these events.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
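For anyone wanting to reproduce a comparable digest from their own logs, the filtering rule described in the issue (keep error and warning entries, plus entries containing any of the four phrases, optionally dropping one logger) might be sketched as below. This is not the reporter's script. The function names, the assumption that the level appears as a standalone ERROR or WARN token, the case-insensitive phrase match, and the one-entry-per-line treatment (stack traces are not grouped) are all illustrative assumptions.

```python
import re

# Phrases named in the issue description. Matching them case-insensitively
# is an assumption; the issue does not say how the match was done.
KEEP_PHRASES = ("begin rebalancing", "end rebalancing", "timed", "zookeeper state")

# Assumption: the log level appears as a standalone token in each line.
LEVEL_RE = re.compile(r"\b(ERROR|WARN(ING)?)\b")


def keep_entry(line, excluded_loggers=()):
    """Return True if a single log line belongs in the digest."""
    # Drop entries from excluded loggers first (e.g. kafka.network.Processor
    # for the 09/11 digest), regardless of level.
    if any(name in line for name in excluded_loggers):
        return False
    if LEVEL_RE.search(line):
        return True
    lowered = line.lower()
    return any(phrase in lowered for phrase in KEEP_PHRASES)


def make_digest(lines, excluded_loggers=()):
    """Filter an iterable of log lines down to the digest entries."""
    return [line for line in lines if keep_entry(line, excluded_loggers)]
```

Calling `make_digest(open(path), excluded_loggers=("kafka.network.Processor",))` on a full log would approximate the 09/11 digest under these assumptions; multi-line entries such as stack traces would need extra handling.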
[jira] [Updated] (KAFKA-2572) zk connection instability, perhaps precipitated by zk client timeout during rebalance
[ https://issues.apache.org/jira/browse/KAFKA-2572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Firth updated KAFKA-2572:
------------------------------
    Description:
On two occasions, have seen zk session expiry, followed by a timeout during a consumer rebalance following this expiry, followed by multiple successive zk session expiries. Restarting the process using the zk client resolved the problems.

Comparing these with a case in which a new stable zk session was created following a session expiry, the timeout during rebalance is not seen in the successful case.

This behavior was seen on 09/08 and 09/11. The attached 'full' logs show all log entries minus entries particular to our application. For 09/08, the time span is 2015-09-08T12:52:06.069-04:00 to 2015-09-08T13:14:48.250-04:00; for 09/11, the time span is 2015-09-11T01:38:17.000-04:00 to 2015-09-11T07:44:47.124-04:00. The digest logs are the result of retaining only error and warning entries, and entries containing any of: "begin rebalancing", "end rebalancing", "timed", and "zookeeper state". For the 09/11 digest logs, entries from the kafka.network.Processor logger are also excised for clarity. Unfortunately, debug logging was not enabled during these events.

  was:
On two occasions, have seen zk session expiry, followed by a timeout during a consumer rebalance following this expiry, followed by multiple successive zk session expiries. Restarting the process using the zk client resolved the problems.

Comparing these with a case in which a new stable zk session was created following a session expiry, the timeout during rebalance is not seen in the successful case.

This behavior was seen on 09/08 and 09/11. The attached 'full' logs show all log entries minus entries particular to our application. For 09/08, the time span is 2015-09-08T12:52:06.069-04:00 to 2015-09-08T13:14:48.250-04:00; for 09/11, the time span is 2015-09-11T01:38:17.000-04:00 to 2015-09-11T07:44:47.124-04:00. The digest logs are the result of retaining only error and warning entries, and entries containing any of: "begin rebalancing", "end rebalancing", "timed", and "zookeeper state". For the 09/11 digest logs, entries from the kafka.network.Processor logger are also excised for clarity.
[jira] [Updated] (KAFKA-2572) zk connection instability, perhaps precipitated by zk client timeout during rebalance
[ https://issues.apache.org/jira/browse/KAFKA-2572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Firth updated KAFKA-2572:
------------------------------
    Attachment: 091115-full.log.zip
                091115-digest.log
                090815-full.log
                090815-digest.log
[jira] [Updated] (KAFKA-2572) zk connection instability, perhaps precipitated by zk client timeout during rebalance
[ https://issues.apache.org/jira/browse/KAFKA-2572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Firth updated KAFKA-2572:
------------------------------
    Attachment:     (was: 091115-full.log.zip)
[jira] [Updated] (KAFKA-2572) zk connection instability, perhaps precipitated by zk client timeout during rebalance
[ https://issues.apache.org/jira/browse/KAFKA-2572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Firth updated KAFKA-2572:
------------------------------
    Description:
On two occasions, have seen zk session expiry, followed by a timeout during a consumer rebalance following this expiry, followed by multiple successive zk session expiries. Restarting the process using the zk client resolved the problems.

Comparing these with a case in which a new stable zk session was created following a session expiry, the timeout during rebalance is not seen in the successful case.

This behavior was seen on 09/08 and 09/11. The attached 'full' logs show all log entries minus entries particular to our application. For 09/08, the time span is 2015-09-08T12:52:06.069-04:00 to 2015-09-08T13:14:48.250-04:00; for 09/11, the time span is 2015-09-11T01:38:17.000-04:00 to 2015-09-11T07:44:47.124-04:00. The digest logs are the result of retaining only error and warning entries, and entries containing any of: "begin rebalancing", "end rebalancing", "timed", and "zookeeper state". For the 09/11 digest logs, entries from the kafka.network.Processor logger are also excised for clarity.

  was:
On two occasions, have seen zk session expiry, followed by a timeout during a consumer rebalance following this expiry, followed by multiple successive zk session expiries.

Comparing these with a case in which a new stable zk session was created following a session expiry, the timeout during rebalance is not seen in the successful case.
[jira] [Updated] (KAFKA-2572) zk connection instability, perhaps precipitated by zk client timeout during rebalance
[ https://issues.apache.org/jira/browse/KAFKA-2572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Firth updated KAFKA-2572:
------------------------------
    Description:
On two occasions, have seen zk session expiry, followed by a timeout during a consumer rebalance following this expiry, followed by multiple successive zk session expiries. Restarting the process using the zk client resolved the problems.

Comparing these with a case in which a new stable zk session was created following a session expiry, the timeout during rebalance is not seen in the successful case.

This behavior was seen on 09/08 and 09/11. The attached 'full' logs show all log entries minus entries particular to our application. For 09/08, the time span is 2015-09-08T12:52:06.069-04:00 to 2015-09-08T13:14:48.250-04:00; for 09/11, the time span is 2015-09-11T01:38:17.000-04:00 to 2015-09-11T07:44:47.124-04:00. The digest logs are the result of retaining only error and warning entries, and entries containing any of: "begin rebalancing", "end rebalancing", "timed", and "zookeeper state". For the 09/11 digest logs, entries from the kafka.network.Processor logger are also excised for clarity. Unfortunately, debug logging was not enabled during these events.

The 09/08 case is a little more straightforward than the 09/11 case. In the 09/08 case, a session times out at 2015-09-08T12:52:06.069-04:00; two timeouts for the same session are then seen during the rebalance that follows the establishment of that session, at 2015-09-08T12:52:19.107-04:00 and 2015-09-08T12:52:31.639-04:00. The rebalance begins at 2015-09-08T12:52:06.667-04:00. The connection to ZK then expires and is reestablished multiple times before the process is killed after 2015-09-08T13:13:41.655-04:00, which marks the last entry in the digest.

The 09/11 case shows repeated cycles of session expiry, followed by rebalancing activity, followed by a pause during which nothing is heard from the zk server, followed by a session timeout.
  was:
On two occasions, have seen zk session expiry, followed by a timeout during a consumer rebalance following this expiry, followed by multiple successive zk session expiries. Restarting the process using the zk client resolved the problems.

Comparing these with a case in which a new stable zk session was created following a session expiry, the timeout during rebalance is not seen in the successful case.

This behavior was seen on 09/08 and 09/11. The attached 'full' logs show all log entries minus entries particular to our application. For 09/08, the time span is 2015-09-08T12:52:06.069-04:00 to 2015-09-08T13:14:48.250-04:00; for 09/11, the time span is 2015-09-11T01:38:17.000-04:00 to 2015-09-11T07:44:47.124-04:00. The digest logs are the result of retaining only error and warning entries, and entries containing any of: "begin rebalancing", "end rebalancing", "timed", and "zookeeper state". For the 09/11 digest logs, entries from the kafka.network.Processor logger are also excised for clarity. Unfortunately, debug logging was not enabled during these events.

> zk connection instability, perhaps precipitated by zk client timeout during rebalance
> -------------------------------------------------------------------------------------
>
>                 Key: KAFKA-2572
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2572
>             Project: Kafka
>          Issue Type: Bug
>          Components: zkclient
>    Affects Versions: 0.8.2.1
>         Environment: zk version 3.4.6, CentOS 6, 2.6.32-504.1.3.el6.x86_64
>            Reporter: John Firth
>         Attachments: 090815-digest.log, 090815-full.log, 091115-digest.log, 091115-full.log.zip
>
> On two occasions, have seen zk session expiry, followed by a timeout during
> a consumer rebalance following this expiry, followed by multiple successive
> zk session expiries. Restarting the process using the zk client resolved the
> problems.
> Comparing these with a case in which a new stable zk session was created
> following a session expiry, the timeout during rebalance is not seen in the
> successful case.
> This behavior was seen on 09/08 and 09/11. The attached 'full' logs show all
> log entries minus entries particular to our application. For 09/08, the time
> span is 2015-09-08T12:52:06.069-04:00 to 2015-09-08T13:14:48.250-04:00; for
> 09/11, the time span is 2015-09-11T01:38:17.000-04:00 to
> 2015-09-11T07:44:47.124-04:00. The digest logs are the result of retaining
> only error and warning entries, and entries containing any of: "begin
> rebalancing", "end rebalancing", "timed", and "zookeeper state". For the
> 09/11 digest logs, entries from the kafka.network.Processor logger are also
> excised for clarity. Unfortunately, debug logging was not enabled during
> these events.
> The 09/08 case is a little more straightforward than the