[ 
https://issues.apache.org/jira/browse/KAFKA-2572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Firth updated KAFKA-2572:
------------------------------
    Description: 
On two occasions, we have seen a zk session expiry, followed by a timeout 
during the consumer rebalance that follows the expiry, followed by multiple 
successive zk session expiries. Restarting the process that uses the zk client 
resolved the problem. 
In a comparison case, in which a new, stable zk session was created after a 
session expiry, no timeout occurred during the rebalance.

This behavior was seen on 09/08 and 09/11 -- the attached 'full' logs show all 
log entries minus entries particular to our application. For 09/08, the time 
span is 2015-09-08T12:52:06.069-04:00 to 2015-09-08T13:14:48.250-04:00; for 
09/11, the time span is 2015-09-11T01:38:17.000-04:00 to 
2015-09-11T07:44:47.124-04:00. The digest logs are the result of retaining only 
error and warning entries, and entries containing any of: "begin rebalancing", 
"end rebalancing", "timed", and "zookeeper state". For the 09/11 digest logs, 
entries from the kafka.network.Processor logger are also excised for clarity. 
Unfortunately, debug logging was not enabled during these events.
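
For anyone who wants to reproduce a digest from the full logs, the filter 
described above could be sketched roughly as follows. This is only an 
illustration, not the script actually used: the function names, the 
whole-word ERROR/WARN level pattern, and the case-insensitive keyword match 
are all assumptions about the log layout.

```python
import re

# Assumption: the level appears on each line as a whole word (ERROR, WARN, WARNING).
LEVEL_RE = re.compile(r"\b(ERROR|WARN(ING)?)\b")

# Phrases quoted in the description; matched case-insensitively (an assumption).
KEYWORDS = ("begin rebalancing", "end rebalancing", "timed", "zookeeper state")

# Logger excised from the 09/11 digest only.
EXCLUDED_LOGGERS = ("kafka.network.Processor",)


def keep_line(line, exclude_loggers=()):
    """Return True if a log line belongs in the digest."""
    # Drop entries from excluded loggers first, whatever their level.
    if any(name in line for name in exclude_loggers):
        return False
    # Keep every error and warning entry.
    if LEVEL_RE.search(line):
        return True
    # Otherwise keep the line only if it contains one of the phrases.
    lowered = line.lower()
    return any(keyword in lowered for keyword in KEYWORDS)


def digest(lines, exclude_loggers=()):
    """Filter an iterable of log lines down to the digest entries."""
    return [line for line in lines if keep_line(line, exclude_loggers)]
```

For the 09/11 logs, the call would be digest(lines, EXCLUDED_LOGGERS); for 
09/08, digest(lines) with no excluded loggers.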




  was:
On two occasions, we have seen a zk session expiry, followed by a timeout 
during the consumer rebalance that follows the expiry, followed by multiple 
successive zk session expiries. Restarting the process that uses the zk client 
resolved the problem. 
In a comparison case, in which a new, stable zk session was created after a 
session expiry, no timeout occurred during the rebalance.

This behavior was seen on 09/08 and 09/11 -- the attached 'full' logs show all 
log entries minus entries particular to our application. For 09/08, the time 
span is 2015-09-08T12:52:06.069-04:00 to 2015-09-08T13:14:48.250-04:00; for 
09/11, the time span is 2015-09-11T01:38:17.000-04:00 to 
2015-09-11T07:44:47.124-04:00. The digest logs are the result of retaining only 
error and warning entries, and entries containing any of: "begin rebalancing", 
"end rebalancing", "timed", and "zookeeper state". For the 09/11 digest logs, 
entries from the kafka.network.Processor logger are also excised for clarity.





> zk connection instability, perhaps precipitated by zk client timeout during 
> rebalance
> -------------------------------------------------------------------------------------
>
>                 Key: KAFKA-2572
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2572
>             Project: Kafka
>          Issue Type: Bug
>          Components: zkclient
>    Affects Versions: 0.8.2.1
>         Environment: zk version 3.4.6,
> CentOS 6, 2.6.32-504.1.3.el6.x86_64
>            Reporter: John Firth
>         Attachments: 090815-digest.log, 090815-full.log, 091115-digest.log, 
> 091115-full.log.zip
>
>
> On two occasions, we have seen a zk session expiry, followed by a timeout 
> during the consumer rebalance that follows the expiry, followed by multiple 
> successive zk session expiries. Restarting the process that uses the zk 
> client resolved the problem. 
> In a comparison case, in which a new, stable zk session was created after a 
> session expiry, no timeout occurred during the rebalance.
>
> This behavior was seen on 09/08 and 09/11 -- the attached 'full' logs show 
> all log entries minus entries particular to our application. For 09/08, the 
> time span is 2015-09-08T12:52:06.069-04:00 to 2015-09-08T13:14:48.250-04:00; 
> for 09/11, the time span is 2015-09-11T01:38:17.000-04:00 to 
> 2015-09-11T07:44:47.124-04:00. The digest logs are the result of retaining 
> only error and warning entries, and entries containing any of: "begin 
> rebalancing", "end rebalancing", "timed", and "zookeeper state". For the 
> 09/11 digest logs, entries from the kafka.network.Processor logger are also 
> excised for clarity. Unfortunately, debug logging was not enabled during 
> these events.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
