[ https://issues.apache.org/jira/browse/KAFKA-7074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matthias J. Sax updated KAFKA-7074: ----------------------------------- Component/s: (was: streams) core > Race condition between Kafka log deletion and compaction > -------------------------------------------------------- > > Key: KAFKA-7074 > URL: https://issues.apache.org/jira/browse/KAFKA-7074 > Project: Kafka > Issue Type: Bug > Components: core > Affects Versions: 0.10.2.1 > Reporter: Phil Mikhailov > Priority: Major > > We have one of standard options of deploying Kafka-based EventSourcing > microservice - deployment with reset. It includes running > {{kafka-reset-tool}} before deploying and running microservice which should > reset offsets for input topics and delete intermediate topics to allow > microservice rebuild its internal state. > We have faced a problem several times (its not 100% reproducible) that > {{LogCleaner}} crashed with {{NoSuchFileException}} during compaction routing > after such deployment. See detailed log: > {code} > Disable possible compaction: [2018-06-14 18:25:01,464] INFO > The cleaning for partition > reports-app-ReportsContractExpansionStateStore-changelog-7 is aborted > Deletion happended: [2018-06-14 18:25:56,761] INFO > Deleting index > /var/lib/kafka/data/reports-app-ReportsContractExpansionStateStore-changelog-7/00000000000000000000.timeindex > (kafka.log.TimeIndex) > Compaction failed 'cause it can't find file: [2018-06-14 18:25:57,402] ERROR > [kafka-log-cleaner-thread-0], Error due to (kafka.log.LogCleaner) > > kafka.common.KafkaStorageException: Failed to change the timeindex file > suffix from to .deleted for log segment 0 > at > kafka.log.LogSegment.kafkaStorageException$1(LogSegment.scala:340) > at > kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:350) > at > kafka.log.Log.kafka$log$Log$$asyncDeleteSegment(Log.scala:981) > at > kafka.log.Log$$anonfun$replaceSegments$1.apply(Log.scala:1027) > at > kafka.log.Log$$anonfun$replaceSegments$1.apply(Log.scala:1022) > at > scala.collection.immutable.List.foreach(List.scala:381) > at > kafka.log.Log.replaceSegments(Log.scala:1022) > at > kafka.log.Cleaner.cleanSegments(LogCleaner.scala:426) > at > kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:363) > at > kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:362) > at > scala.collection.immutable.List.foreach(List.scala:381) > at > kafka.log.Cleaner.clean(LogCleaner.scala:362) > at > kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:241) > at > kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:220) > at > kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63) > Caused by: > java.nio.file.NoSuchFileException: > /var/lib/kafka/data/reports-app-ReportsContractExpansionStateStore-changelog-7/00000000000000000000.timeindex > at > sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) > at > sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) > at > sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) > at > sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409) > at > sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262) > at > java.nio.file.Files.move(Files.java:1395) > at > org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:711) > at > kafka.log.AbstractIndex.renameTo(AbstractIndex.scala:126) > ... 14 more > Suppressed: > java.nio.file.NoSuchFileException: > /var/lib/kafka/data/reports-app-ReportsContractExpansionStateStore-changelog-7/00000000000000000000.timeindex > -> > > /var/lib/kafka/data/reports-app-ReportsContractExpansionStateStore-changelog-7/00000000000000000000.timeindex.deleted > at > sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) > at > sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) > at > sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:396) > at > sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262) > at > java.nio.file.Files.move(Files.java:1395) > at > org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:708) > ... 15 > more > {code} > It looks like disabling compaction while deleting log doesn't work. > After that compaction stops working and space consumption grows and there is > only one solution to this problem is to restart broker. -- This message was sent by Atlassian JIRA (v7.6.3#76005)