[ https://issues.apache.org/jira/browse/KAFKA-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15312945#comment-15312945 ]
Buvaneswari Ramanan commented on KAFKA-3689:
--------------------------------------------

All connections are shown to be in ESTABLISHED state.

Here is the scenario under which this arises:
* 8000 producers & 16000 consumers, all utilizing kafka-python library
* brokers go through abnormal shutdown - an easy way to create this scenario:
* cleanly shut down all the zookeepers while brokers are running
* wait for the following message in broker log:
  INFO [Kafka Server ], shutting down (kafka.server.KafkaServer).
  As you are probably aware, even though they start the shutdown, the process takes a while, and brokers will continue to be up until the zks are back.
* now restart zks so that brokers will shut down eventually
* now start brokers
* network.Processor error message appears in at least one of the brokers within a few hours

netstat shows all connections to be ESTABLISHED at the broker end.

> ERROR Processor got uncaught exception. (kafka.network.Processor)
> -----------------------------------------------------------------
>
> Key: KAFKA-3689
> URL: https://issues.apache.org/jira/browse/KAFKA-3689
> Project: Kafka
> Issue Type: Bug
> Components: network
> Affects Versions: 0.9.0.1
> Environment: ubuntu 14.04,
> java version "1.7.0_95"
> OpenJDK Runtime Environment (IcedTea 2.6.4) (7u95-2.6.4-0ubuntu0.14.04.2)
> OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
> 3 broker cluster (all 3 servers identical - Intel Xeon E5-2670 @2.6GHz,
> 8cores, 16 threads 64 GB RAM & 1 TB Disk)
> Kafka Cluster is managed by 3 server ZK cluster (these servers are different
> from Kafka broker servers). All 6 servers are connected via 10G switch.
> Producers run from external servers.
> Reporter: Buvaneswari Ramanan
> Assignee: Jun Rao
> Priority: Minor
> Fix For: 0.10.1.0, 0.10.0.1
>
> Original Estimate: 72h
> Remaining Estimate: 72h
>
> As per Ismael Juma's suggestion in email thread to us...@kafka.apache.org
> with the same subject, I am creating this bug report.
> The following error occurs in one of the brokers in our 3 broker cluster,
> which serves about 8000 topics. These topics are single partitioned with a
> replication factor = 3. Each topic gets data at a low rate – 200 bytes per
> sec. Leaders are balanced across the topics.
> Producers run from external servers (4 Ubuntu servers with same config as the
> brokers), each producing to 2000 topics utilizing kafka-python library.
> This error message occurs repeatedly in one of the servers. Between the hours
> of 10:30am and 1:30pm on 5/9/16, there were about 10 Million such
> occurrences. This was right after a cluster restart.
> This is not the first time we got this error in this broker. In those
> instances, error occurred hours / days after cluster restart.
> =====================================================
> [2016-05-09 10:38:43,932] ERROR Processor got uncaught exception. (kafka.network.Processor)
> java.lang.IllegalArgumentException: Attempted to decrease connection count for address with no connections, address: /X.Y.Z.144 (actual network address masked)
>         at kafka.network.ConnectionQuotas$$anonfun$9.apply(SocketServer.scala:565)
>         at kafka.network.ConnectionQuotas$$anonfun$9.apply(SocketServer.scala:565)
>         at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
>         at scala.collection.AbstractMap.getOrElse(Map.scala:59)
>         at kafka.network.ConnectionQuotas.dec(SocketServer.scala:564)
>         at kafka.network.Processor$$anonfun$run$13.apply(SocketServer.scala:450)
>         at kafka.network.Processor$$anonfun$run$13.apply(SocketServer.scala:445)
>         at scala.collection.Iterator$class.foreach(Iterator.scala:742)
>         at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
>         at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>         at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>         at kafka.network.Processor.run(SocketServer.scala:445)
>         at java.lang.Thread.run(Thread.java:745)
> [2016-05-09 10:38:43,932] ERROR Processor got uncaught exception. (kafka.network.Processor)
> java.lang.IllegalArgumentException: Attempted to decrease connection count for address with no connections, address: /X.Y.Z.144
>         at kafka.network.ConnectionQuotas$$anonfun$9.apply(SocketServer.scala:565)
>         at kafka.network.ConnectionQuotas$$anonfun$9.apply(SocketServer.scala:565)
>         at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
>         at scala.collection.AbstractMap.getOrElse(Map.scala:59)
>         at kafka.network.ConnectionQuotas.dec(SocketServer.scala:564)
>         at kafka.network.Processor$$anonfun$run$13.apply(SocketServer.scala:450)
>         at kafka.network.Processor$$anonfun$run$13.apply(SocketServer.scala:445)
>         at scala.collection.Iterator$class.foreach(Iterator.scala:742)
>         at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
>         at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>         at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>         at kafka.network.Processor.run(SocketServer.scala:445)
>         at java.lang.Thread.run(Thread.java:745)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
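
For readers following the trace: the top frames (ConnectionQuotas.dec calling getOrElse) suggest the broker keeps a per-address connection count and raises when asked to decrement an address it has no entry for. Below is a minimal Python sketch of that bookkeeping pattern, assuming a simple lock-protected map. `ConnectionCounter` and its methods are hypothetical names for illustration only; this is not Kafka's code, and it does not explain why the broker's count went missing.

```python
import threading


class ConnectionCounter:
    """Hypothetical per-address connection counter (illustration only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {}  # address -> number of open connections

    def inc(self, address):
        # Called when a connection from `address` is accepted.
        with self._lock:
            self._counts[address] = self._counts.get(address, 0) + 1

    def dec(self, address):
        # Called when a connection from `address` is closed.
        with self._lock:
            if address not in self._counts:
                # Mirrors the message seen in the broker log.
                raise ValueError(
                    "Attempted to decrease connection count for address "
                    "with no connections, address: " + address
                )
            if self._counts[address] == 1:
                del self._counts[address]
            else:
                self._counts[address] -= 1

    def count(self, address):
        with self._lock:
            return self._counts.get(address, 0)


if __name__ == "__main__":
    counter = ConnectionCounter()
    counter.inc("/X.Y.Z.144")
    counter.dec("/X.Y.Z.144")
    try:
        # A second close for the same connection has nothing left to decrement.
        counter.dec("/X.Y.Z.144")
    except ValueError as err:
        print(err)
```

In this sketch, any close that is processed more than once, or for a connection that was never counted, ends in the same error. That is one way such a message could repeat, offered as an assumption rather than a diagnosis.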