Yup. In fact, I just ran the test program again while the Kafka broker is
still running, using the same user, of course. I was able to get up to 10K
connections with the test program. The test program uses the same Java NIO
library that the broker does, so the machine is capable of handling that
many connections. The only issue I saw was that the NIO
ServerSocketChannel is a bit slow at accepting connections once the total
connection count gets to around 4K, but this could be because I put the
ServerSocketChannel in the same Selector as the 4K SocketChannels. So
sometimes on the client side, I see:

java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcher.write0(Native Method)
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:122)
        at sun.nio.ch.IOUtil.write(IOUtil.java:93)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:352)
        at FdTest$ClientThread.run(FdTest.java:108)


But all I have to do is sleep for a bit on the client and then retry.
However, 4K does seem like a magic number, since that seems to be the
number of connections the Kafka broker machine can handle before it gives
me the "Too Many Open Files" error and eventually crashes.
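
For context, the shape of the test is roughly what's below. This is just a
sketch, not the actual FdTest source; the class name, port, and buffer
handling are made up for illustration, but it shows the single-Selector
layout I described (the listening ServerSocketChannel registered in the
same Selector as every accepted SocketChannel) and the client-side
sleep-and-retry:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class SelectorSketch {

    // Server side: one Selector holds the ServerSocketChannel plus every
    // accepted SocketChannel, so OP_ACCEPT competes with thousands of
    // OP_READ events once the connection count climbs.
    static void runServer(int port) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.configureBlocking(false);
        server.socket().bind(new InetSocketAddress(port));
        server.register(selector, SelectionKey.OP_ACCEPT);

        ByteBuffer buf = ByteBuffer.allocate(1024);
        while (true) {
            selector.select();
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {
                    SocketChannel ch = ((ServerSocketChannel) key.channel()).accept();
                    if (ch != null) {
                        ch.configureBlocking(false);
                        ch.register(selector, SelectionKey.OP_READ);
                    }
                } else if (key.isReadable()) {
                    buf.clear();
                    if (((SocketChannel) key.channel()).read(buf) < 0) {
                        key.channel().close();   // peer went away
                    }
                }
            }
        }
    }

    // Client side: on "Connection reset by peer" (or any IOException while
    // connecting/writing), sleep briefly and retry, which is all the test
    // needed to get past the slow accepts.
    static SocketChannel connectWithRetry(InetSocketAddress addr)
            throws InterruptedException {
        while (true) {
            try {
                SocketChannel ch = SocketChannel.open(addr);
                ch.write(ByteBuffer.wrap("ping".getBytes()));
                return ch;
            } catch (IOException e) {
                Thread.sleep(100);   // back off a bit, then try again
            }
        }
    }
}

Moving the listener onto its own Selector would presumably avoid the slow
accepts, but I haven't tried that.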

Paul Lung

On 7/8/14, 9:29 PM, "Jun Rao" <jun...@gmail.com> wrote:

>Does your test program run as the same user as the Kafka broker?
>
>Thanks,
>
>Jun
>
>
>On Tue, Jul 8, 2014 at 1:42 PM, Lung, Paul <pl...@ebay.com> wrote:
>
>> Hi Guys,
>>
>> I'm seeing the following errors from the 0.8.1.1 broker. This occurs
>> most often on the Controller machine. Then the controller process
>> crashes, and the controller bounces to other machines, which causes
>> those machines to crash. Looking at the file descriptors being held by
>> the process, it's only around 4000 or so (looking at . There aren't a
>> whole lot of connections in TIME_WAIT states, and I've increased the
>> ephemeral port range to "16000 - 64000" via
>> "/proc/sys/net/ipv4/ip_local_port_range". I've written a Java test
>> program to see how many sockets and files I can open. The socket count
>> is definitely limited by the ephemeral port range, which was around
>> 22K at the time. But I can open tons of files, since the open file
>> limit of the user is set to 100K.
>>
>> So given that I can theoretically open 48K sockets and probably 90K
>> files, and I only see around 4K total for the Kafka broker, I'm really
>> confused as to why I'm seeing this error. Is there some internal Kafka
>> limit that I don't know about?
>>
>> Paul Lung
>>
>>
>>
>> java.io.IOException: Too many open files
>>         at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>>         at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:163)
>>         at kafka.network.Acceptor.accept(SocketServer.scala:200)
>>         at kafka.network.Acceptor.run(SocketServer.scala:154)
>>         at java.lang.Thread.run(Thread.java:679)
>>
>> [2014-07-08 13:07:21,534] ERROR Error in acceptor (kafka.network.Acceptor)
>>
>> java.io.IOException: Too many open files
>>         at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>>         at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:163)
>>         at kafka.network.Acceptor.accept(SocketServer.scala:200)
>>         at kafka.network.Acceptor.run(SocketServer.scala:154)
>>         at java.lang.Thread.run(Thread.java:679)
>>
>> [2014-07-08 13:07:21,563] ERROR [ReplicaFetcherThread-3-2124488], Error for partition [bom__021____active_80__32__mini____activeitem_lvs_qn,0] to broker 2124488:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
>>
>> [2014-07-08 13:07:21,558] FATAL [Replica Manager on Broker 2140112]: Error writing to highwatermark file:  (kafka.server.ReplicaManager)
>>
>> java.io.FileNotFoundException: /ebay/cronus/software/cronusapp_home/kafka/kafka-logs/replication-offset-checkpoint.tmp (Too many open files)
>>         at java.io.FileOutputStream.open(Native Method)
>>         at java.io.FileOutputStream.<init>(FileOutputStream.java:209)
>>         at java.io.FileOutputStream.<init>(FileOutputStream.java:160)
>>         at java.io.FileWriter.<init>(FileWriter.java:90)
>>         at kafka.server.OffsetCheckpoint.write(OffsetCheckpoint.scala:37)
>>         at kafka.server.ReplicaManager$$anonfun$checkpointHighWatermarks$2.apply(ReplicaManager.scala:447)
>>         at kafka.server.ReplicaManager$$anonfun$checkpointHighWatermarks$2.apply(ReplicaManager.scala:444)
>>         at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
>>         at scala.collection.immutable.Map$Map1.foreach(Map.scala:109)
>>         at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
>>         at kafka.server.ReplicaManager.checkpointHighWatermarks(ReplicaManager.scala:444)
>>         at kafka.server.ReplicaManager$$anonfun$1.apply$mcV$sp(ReplicaManager.scala:94)
>>         at kafka.utils.KafkaScheduler$$anon$1.run(KafkaScheduler.scala:100)
>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>         at java.lang.Thread.run(Thread.java:679)
>>
>>
>>
>>
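
P.S. For cross-checking the broker JVM's own view of its descriptor usage
against that ~4K number, something along these lines can be run inside a
JVM on the same box (just a sketch; it relies on the HotSpot-specific
com.sun.management.UnixOperatingSystemMXBean and isn't part of the FdTest
program above):

import java.lang.management.ManagementFactory;
import com.sun.management.UnixOperatingSystemMXBean;

public class FdUsage {
    public static void main(String[] args) {
        // HotSpot on Unix exposes file descriptor counters on its OS MXBean.
        UnixOperatingSystemMXBean os =
                (UnixOperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean();
        System.out.println("open fds: " + os.getOpenFileDescriptorCount()
                + " / max fds: " + os.getMaxFileDescriptorCount());
    }
}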
