[ https://issues.apache.org/jira/browse/KAFKA-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553882#comment-14553882 ]
Albert Visagie commented on KAFKA-2201:
---------------------------------------

Excerpt from lsof:

COMMAND   PID USER FD   TYPE DEVICE SIZE/OFF    NODE NAME
java    29258 root DEL  REG    8,49          1654802 /tmp/kafka-logs/xxxxx.phid-3/00000000000092175377.index.deleted
java    29258 root mem  REG    8,49    55816 1687762 /tmp/kafka-logs/xxxxx.zzzzzupdate.phid-0/00000000000041645890.index
java    29258 root DEL  REG    8,49          1851417 /tmp/kafka-logs/xxxxx.phid-2/00000000000092164826.index.deleted
java    29258 root mem  REG    8,49    55760  802895 /tmp/kafka-logs/xxxxx.zzzzzupdate.phid-3/00000000000041748899.index
java    29258 root DEL  REG    8,49          1867784 /tmp/kafka-logs/xxxxx.phid-1/00000000000091807304.index.deleted
java    29258 root DEL  REG    8,49          1679464 /tmp/kafka-logs/xxxxx.phid-0/00000000000091714298.index.deleted
java    29258 root DEL  REG    8,49          1654800 /tmp/kafka-logs/xxxxx.phid-3/00000000000091778942.index.deleted
java    29258 root DEL  REG    8,49          1851413 /tmp/kafka-logs/xxxxx.phid-2/00000000000091765907.index.deleted
java    29258 root mem  REG    8,49    59440 1720341 /tmp/kafka-logs/yyyyy.zzzzzupdate.wwww-3/00000000000018996626.index
java    29258 root mem  REG    8,49    59432 3514436 /tmp/kafka-logs/yyyyy.zzzzzupdate.wwww-1/00000000000018778185.index
java    29258 root DEL  REG    8,49          1835168 /tmp/kafka-logs/xxxxx.phid-1/00000000000091409225.index.deleted

The .deleted ones grow steadily.
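For what it's worth, the DEL rows in lsof are consistent with memory-mapped index files that were unlinked while their mappings were still live: on Linux the kernel keeps the inode (and the handle) until the last mapping goes away, which for a MappedByteBuffer means until it is garbage-collected. A minimal sketch of that behavior (the file name demo.index is made up for illustration; this is not Kafka code):

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MmapLeakDemo {
    // Maps a file, writes a value, closes the channel, unlinks the file,
    // then reads the value back through the still-live mapping.
    static int mapWriteDeleteRead() throws Exception {
        File f = new File("demo.index");
        RandomAccessFile raf = new RandomAccessFile(f, "rw");
        raf.setLength(4096);
        MappedByteBuffer buf =
            raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, 4096);
        buf.putInt(0, 42);
        raf.close();   // closing the channel does NOT release the mapping
        f.delete();    // unlink: lsof now shows the inode as "DEL ... (deleted)"
        // The inode stays alive until the MappedByteBuffer is
        // garbage-collected, so reading through it still works:
        return buf.getInt(0);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(mapWriteDeleteRead()); // prints 42
    }
}
```

If the broker rolls (and deletes) segments faster than the old index mappings get collected, the mapping count grows until FileChannelImpl.map fails, which would match the "OutOfMemoryError: Map failed" below.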
> Open file handle leak
> ---------------------
>
>                 Key: KAFKA-2201
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2201
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8.2.1
>         Environment: Debian Linux 7, 64 bit
>                      Oracle JDK 1.7.0u40, 64-bit
>            Reporter: Albert Visagie
>
> The Kafka broker crashes with the following stack trace in server.log roughly every 18 hours:
>
> [2015-05-19 07:39:22,924] FATAL [KafkaApi-0] Halting due to unrecoverable I/O error while handling produce request: (kafka.server.KafkaApis)
> kafka.common.KafkaStorageException: I/O exception in append to log 'nnnnnnn-1'
>     at kafka.log.Log.append(Log.scala:266)
>     at kafka.cluster.Partition$$anonfun$appendMessagesToLeader$1.apply(Partition.scala:379)
>     at kafka.cluster.Partition$$anonfun$appendMessagesToLeader$1.apply(Partition.scala:365)
>     at kafka.utils.Utils$.inLock(Utils.scala:535)
>     at kafka.utils.Utils$.inReadLock(Utils.scala:541)
>     at kafka.cluster.Partition.appendMessagesToLeader(Partition.scala:365)
>     at kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:291)
>     at kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:282)
>     at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>     at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>     at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>     at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>     at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
>     at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
>     at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
>     at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>     at scala.collection.AbstractTraversable.map(Traversable.scala:105)
>     at kafka.server.KafkaApis.appendToLocalLog(KafkaApis.scala:282)
>     at kafka.server.KafkaApis.handleProducerOrOffsetCommitRequest(KafkaApis.scala:204)
>     at kafka.server.KafkaApis.handle(KafkaApis.scala:59)
>     at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:59)
>     at java.lang.Thread.run(Thread.java:724)
> Caused by: java.io.IOException: Map failed
>     at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:888)
>     at kafka.log.OffsetIndex$$anonfun$resize$1.apply(OffsetIndex.scala:286)
>     at kafka.log.OffsetIndex$$anonfun$resize$1.apply(OffsetIndex.scala:276)
>     at kafka.utils.Utils$.inLock(Utils.scala:535)
>     at kafka.log.OffsetIndex.resize(OffsetIndex.scala:276)
>     at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply$mcV$sp(OffsetIndex.scala:265)
>     at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply(OffsetIndex.scala:265)
>     at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply(OffsetIndex.scala:265)
>     at kafka.utils.Utils$.inLock(Utils.scala:535)
>     at kafka.log.OffsetIndex.trimToValidSize(OffsetIndex.scala:264)
>     at kafka.log.Log.roll(Log.scala:563)
>     at kafka.log.Log.maybeRoll(Log.scala:539)
>     at kafka.log.Log.append(Log.scala:306)
>     ... 21 more
> Caused by: java.lang.OutOfMemoryError: Map failed
>     at sun.nio.ch.FileChannelImpl.map0(Native Method)
>     at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:885)
>     ... 33 more
>
> The Kafka broker's open file handles, as counted by
>
>     lsof | grep pid | wc -l
>
> grow steadily as it runs. Under our load it lasts about 18 hours before crashing with the stack trace above.
>
> We were experimenting with settings under Log Retention Policy in server.properties:
>
>     log.retention.hours=168
>     log.retention.bytes=107374182
>     log.segment.bytes=1073741
>     log.retention.check.interval.ms=3000
>
> The result is that the broker rolls over segments quite rapidly. We don't have to run it that way, of course.
> We are running only one broker at the moment.
> lsof shows many open files that have no size, carry the suffix ".deleted", and are absent from ls in the log directory.
>
> This is Kafka 0.8.2.1 with Scala 2.10.4, as downloaded from the website last week.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)