Hi Guys,

I’m seeing the following errors from the 0.8.1.1 broker. They occur most often 
on the Controller machine. The controller process then crashes, and the 
controller role bounces to other machines, which causes those machines to crash too. 
Looking at the file descriptors being held by the process, it’s only around 
4000 or so(looking at . There aren’t a whole lot of connections in TIME_WAIT 
states, and I’ve increased the ephemeral port range to “16000 – 64000” via 
“/proc/sys/net/ipv4/ip_local_port_range”. I’ve written a Java test program to 
see how many sockets and files I can open. The number of sockets is definitely 
limited by the ephemeral port range, which was around 22K at the time. But I 
can open tons of files, since the user’s open file limit is set to 100K.
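
For anyone who wants to repeat the file half of that test, a probe of this kind could look roughly like the sketch below. This is not the actual test program mentioned above; the class name FdProbe, the method openUpTo, and the cap argument are hypothetical names chosen for illustration. It opens handles on a temp file until the cap is reached or the OS refuses, then closes everything again.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical probe: counts how many file handles this JVM can hold open at once.
class FdProbe {

    // Opens up to `cap` handles on a single temp file and returns how many succeeded.
    static int openUpTo(int cap) {
        List<FileInputStream> open = new ArrayList<>();
        File tmp = null;
        try {
            tmp = File.createTempFile("fdprobe", ".tmp");
            while (open.size() < cap) {
                open.add(new FileInputStream(tmp)); // each stream holds one descriptor
            }
        } catch (IOException e) {
            // Typically "Too many open files" once the per-process limit is reached.
            System.err.println("Stopped after " + open.size() + ": " + e.getMessage());
        } finally {
            // Release every descriptor we took, then remove the temp file.
            for (FileInputStream in : open) {
                try { in.close(); } catch (IOException ignored) { }
            }
            if (tmp != null) {
                tmp.delete();
            }
        }
        return open.size();
    }

    public static void main(String[] args) {
        int cap = args.length > 0 ? Integer.parseInt(args[0]) : 1000;
        System.out.println("Opened " + openUpTo(cap) + " file handles");
    }
}
```

Running it as the same user, and started the same way as the broker, is what makes the number comparable to the broker's situation.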

So given that I can theoretically open 48K sockets and probably 90K files, and 
I only see around 4K total for the Kafka broker, I’m really confused as to why 
I’m seeing this error. Is there some internal Kafka limit that I don’t know 
about?
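
One thing that may be worth ruling out is whether the limit the running JVM actually got matches the user's configured 100K, since a process started from an init script or a supervisor can end up with a different limit than a login shell. A minimal sketch of how a JVM can report its own numbers is below, assuming a HotSpot-style JVM on a Unix-like OS where com.sun.management.UnixOperatingSystemMXBean is available; the class name FdLimits and the method snapshot are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

// Hypothetical helper: reports this JVM's open and maximum file descriptor counts.
class FdLimits {

    // Returns {open, max} for the current process, or null if the platform bean
    // is not the Unix-specific one (for example on Windows).
    static long[] snapshot() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        if (s == null) {
            System.out.println("Descriptor counts are not available on this platform");
        } else {
            System.out.println("open=" + s[0] + " max=" + s[1]);
        }
    }
}
```

This only describes the JVM it runs in, so for the broker itself the equivalent numbers would have to be read from the broker process rather than from a separate test program.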

Paul Lung



java.io.IOException: Too many open files
        at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:163)
        at kafka.network.Acceptor.accept(SocketServer.scala:200)
        at kafka.network.Acceptor.run(SocketServer.scala:154)
        at java.lang.Thread.run(Thread.java:679)

[2014-07-08 13:07:21,534] ERROR Error in acceptor (kafka.network.Acceptor)
java.io.IOException: Too many open files
        at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:163)
        at kafka.network.Acceptor.accept(SocketServer.scala:200)
        at kafka.network.Acceptor.run(SocketServer.scala:154)
        at java.lang.Thread.run(Thread.java:679)

[2014-07-08 13:07:21,563] ERROR [ReplicaFetcherThread-3-2124488], Error for partition [bom__021____active_80__32__mini____activeitem_lvs_qn,0] to broker 2124488:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)

[2014-07-08 13:07:21,558] FATAL [Replica Manager on Broker 2140112]: Error writing to highwatermark file:  (kafka.server.ReplicaManager)
java.io.FileNotFoundException: /ebay/cronus/software/cronusapp_home/kafka/kafka-logs/replication-offset-checkpoint.tmp (Too many open files)
        at java.io.FileOutputStream.open(Native Method)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:209)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:160)
        at java.io.FileWriter.<init>(FileWriter.java:90)
        at kafka.server.OffsetCheckpoint.write(OffsetCheckpoint.scala:37)
        at kafka.server.ReplicaManager$$anonfun$checkpointHighWatermarks$2.apply(ReplicaManager.scala:447)
        at kafka.server.ReplicaManager$$anonfun$checkpointHighWatermarks$2.apply(ReplicaManager.scala:444)
        at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
        at scala.collection.immutable.Map$Map1.foreach(Map.scala:109)
        at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
        at kafka.server.ReplicaManager.checkpointHighWatermarks(ReplicaManager.scala:444)
        at kafka.server.ReplicaManager$$anonfun$1.apply$mcV$sp(ReplicaManager.scala:94)
        at kafka.utils.KafkaScheduler$$anon$1.run(KafkaScheduler.scala:100)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:679)


