[ https://issues.apache.org/jira/browse/KAFKA-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ray Chiang resolved KAFKA-5153.
-------------------------------
    Resolution: Information Provided
    Fix Version/s: 0.11.0.1

Based on the last comment, upgrading fixed the problem.

> KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting
> ---------------------------------------------------------------------------
>
>                 Key: KAFKA-5153
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5153
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.10.2.0, 0.11.0.0
>         Environment: RHEL 6
> Java Version 1.8.0_91-b14
>            Reporter: Arpan
>            Priority: Critical
>              Labels: reliability
>             Fix For: 0.11.0.1
>
>         Attachments: ThreadDump_1493564142.dump, ThreadDump_1493564177.dump,
> ThreadDump_1493564249.dump, server.properties, server_1_72server.log,
> server_2_73_server.log, server_3_74Server.log
>
>
> Hi Team,
> I was earlier referring to issue KAFKA-4477 because the problem I am facing
> is similar. I searched the release docs for that reference as well but did
> not find anything for 0.10.1.1 or 0.10.2.0. I am currently using
> 2.11_0.10.2.0.
> I have a 3-node Kafka cluster, with a ZooKeeper (ZK) cluster running in
> cluster mode on the same set of servers. Around 240 GB of data is
> transferred through Kafka every day. What we observe is servers
> disconnecting from the cluster and the ISR shrinking, which then starts
> impacting service.
> I have also observed the file descriptor count increasing a bit. Under
> normal circumstances we have not seen an FD count above 500, but when the
> issue started we saw it in the range of 650-700 on all 3 servers.
> Attaching thread dumps of all 3 servers from when we recently started
> facing the issue.
> The issue goes away once the nodes are bounced, and the setup does not run
> for more than 5 days without this issue. Attaching server logs as well.
> Kindly let me know if you need any additional information.
> Attaching server.properties as well for one of the servers (it is similar
> on all 3 servers).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)