[
https://issues.apache.org/jira/browse/KAFKA-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Manikumar resolved KAFKA-2553.
------------------------------
Resolution: Fixed
Similar issues are tracked in KAFKA-2169. Please reopen if the issue still exists.
> Kafka Consumer Hangs after Network Partition
> --------------------------------------------
>
> Key: KAFKA-2553
> URL: https://issues.apache.org/jira/browse/KAFKA-2553
> Project: Kafka
> Issue Type: Bug
> Components: consumer
> Affects Versions: 0.8.1.1
> Environment: Amazon EC2, Ubuntu 12.04.
> Reporter: Aaditya Ramesh
> Assignee: Neha Narkhede
> Priority: Major
> Attachments: kafka_bug_report
>
>
> We have a Kafka consumer in an EC2 instance in Ireland that fetches data from
> a Kafka cluster in a datacenter in the eastern United States. We sporadically
> encounter strange network partitions where we are unable to ping any machines
> between the two data centers (the ping always times out), but this kind of
> network partition is not too strange for inter-data center connections.
> However, the Kafka consumer's connection to ZooKeeper never recovers after one
> of these network hiccups and requires a full process restart in order to begin
> consuming from the remote data center after the network has recovered. The
> relevant code in ZookeeperConsumerConnector.scala catches all Throwables and
> does nothing with them, so the process is never alerted and no metrics are
> emitted that we could use to diagnose such a hung state. We therefore patched
> the client code in our codebase to perform a System.exit(0) whenever this
> occurs, since a restart is better than failing silently.
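
For readers following the description above, here is a minimal sketch of the general pattern the reporter describes: surfacing a caught Throwable instead of swallowing it. It is not the code in ZookeeperConsumerConnector.scala and not the reporter's patch. The class GuardedTask, its Task interface, the failure counter, and the onFatal callback are all hypothetical names introduced for illustration, and the consecutive-failure threshold is an assumption rather than anything stated in the report.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

/**
 * Hypothetical wrapper (not part of Kafka): runs a task, and on any Throwable
 * logs it, counts it, and escalates after too many consecutive failures.
 */
final class GuardedTask {

    /** Hypothetical unit of work that may throw anything. */
    interface Task {
        void run() throws Throwable;
    }

    // Total failures seen; a real client could export this as a metric.
    private final AtomicLong failures = new AtomicLong();
    private final int maxConsecutiveFailures;
    private final Consumer<Throwable> onFatal;
    private int consecutive = 0;

    GuardedTask(int maxConsecutiveFailures, Consumer<Throwable> onFatal) {
        this.maxConsecutiveFailures = maxConsecutiveFailures;
        this.onFatal = onFatal;
    }

    /** Runs the task once; returns true on success, false if it threw. */
    synchronized boolean attempt(Task task) {
        try {
            task.run();
            consecutive = 0; // a success resets the streak
            return true;
        } catch (Throwable t) {
            // Make the failure visible instead of silently dropping it.
            failures.incrementAndGet();
            consecutive++;
            System.err.println("task failed (" + consecutive + " in a row): " + t);
            if (consecutive >= maxConsecutiveFailures) {
                onFatal.accept(t); // e.g. alert, or terminate the process
            }
            return false;
        }
    }

    long failureCount() {
        return failures.get();
    }
}
```

A caller might construct it as `new GuardedTask(3, t -> System.exit(1))`. The report used System.exit(0); a non-zero status is an assumption here, chosen so that a process supervisor could tell the exit apart from a clean shutdown.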