What does the output of:

lsof -p <broker-pid>

show?
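In particular, the total descriptor count and a breakdown by type would help tell
whether it is client sockets or log segment file handles that are piling up.
Something along these lines should do (a rough sketch; substitute the actual pid):

lsof -p <broker-pid> | wc -l
lsof -p <broker-pid> | awk '{print $5}' | sort | uniq -c | sort -rn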

-Jaikiran

On Monday 12 September 2016 10:03 PM, Michael Sparr wrote:
5-node Kafka cluster on bare metal: Ubuntu 14.04.x LTS boxes with 64GB RAM, 8 cores,
and 960GB SSDs. A single node in the cluster is filling its logs with the following:

[2016-09-12 09:34:49,522] ERROR Error while accepting connection (kafka.network.Acceptor)
java.io.IOException: Too many open files
        at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
        at kafka.network.Acceptor.accept(SocketServer.scala:323)
        at kafka.network.Acceptor.run(SocketServer.scala:268)
        at java.lang.Thread.run(Thread.java:745)

No other node in the cluster has this issue. A separate application server runs 
consumers/producers using librdkafka + the confluent-kafka-python library, with a 
few million messages published across fewer than 100 topics.

For days now the /var/log/kafka/kafka.server.log.N files have been filling up with this 
message and using up all the disk space, but only on this single node in the cluster. I 
have soft/hard limits at 65,535 for all users, and ulimit -n confirms 65535.
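
Should I also be checking the limit that actually applies to the running broker process, 
in case the broker was started before the limits change took effect? Something like this, 
I assume:

cat /proc/<broker-pid>/limits | grep 'Max open files'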

Is there a librdkafka setting I should add to the Python producer clients to close 
socket connections sooner and avoid this, or is something else going on?
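
For example, I was considering tightening connection-related settings in the producer 
config along these lines (just a sketch based on my reading of the librdkafka 
CONFIGURATION docs; the values are guesses, not recommendations):

from confluent_kafka import Producer

# Sketch of connection-related producer settings I was considering.
producer = Producer({
    'bootstrap.servers': 'broker1:9092,broker2:9092',
    'socket.timeout.ms': 30000,        # fail requests on a dead socket sooner
    'socket.keepalive.enable': True,   # let the OS detect half-open connections
    'log.connection.close': False,     # quiet the periodic idle-disconnect log noise
})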

Should I file this as an issue in a GitHub repo, and if so, which project?


Thanks!


