[ https://issues.apache.org/jira/browse/KAFKA-9133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16966957#comment-16966957 ]
Karolis Pocius commented on KAFKA-9133:
---------------------------------------

Some additional information. A broker restart doesn't help -- as soon as the log cleaner is loaded, it errors out with the same message.

It is possible to identify the specific log by searching for the active segment's base offset reported in the error, like so: `find /path/to/log/dir/ -name '*5019648.log'` -- at least in our case it was always the same two changelog topics created by Kafka Streams (versions 2.3.0 and 2.2.1).

We have experienced the same issue on all clusters that we've upgraded; four of those were upgraded from 2.2.1 and one from 2.3.0.

I have tried a patched version using the code submitted by [~timvanlaer] -- while it does produce a new warning in server.log, the log cleaner still fails with the same error.

The workaround we're currently using is reassigning the offending partition to a different broker, then bumping log.cleaner.threads up and down to restart the log cleaner. That way we avoid bouncing the brokers, which would require multiple restarts, since usually several partitions are affected but they can only be identified one at a time.

The issue seems to recur at irregular intervals -- sometimes it runs fine for a couple of days, sometimes it errors out a couple of times a day.
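For anyone who wants to script the lookup and the first half of the workaround, here is a minimal Python sketch. It relies on two layout facts: segment files are named after their base offset zero-padded to 20 digits (e.g. `00000000000005019648.log`), and partition directories are named `<topic>-<partition>`. The function names, the example topic, and the idea of giving every affected partition the same replica list are assumptions made for illustration only, not something taken from our setup.

```python
import json
import os
import re
from typing import Iterator, List, Tuple

# Matches the base offset in the "Caused by" line as it appears in server.log
# (i.e. on a single line, without mail quoting).
_BASE_OFFSET_RE = re.compile(r"active\s+segment's\s+base\s+offset\s+(\d+)")


def base_offset_from_error(message: str) -> int:
    """Extract the active segment's base offset from the cleaner error text."""
    match = _BASE_OFFSET_RE.search(message)
    if match is None:
        raise ValueError("no active-segment base offset found in message")
    return int(match.group(1))


def segment_file_name(base_offset: int) -> str:
    """Segment files are named after the base offset, zero-padded to 20 digits."""
    return f"{base_offset:020d}.log"


def split_topic_partition(dir_name: str) -> Tuple[str, int]:
    """Split '<topic>-<partition>'; topic names may themselves contain '-'."""
    topic, _, partition = dir_name.rpartition("-")
    return topic, int(partition)


def find_partitions(log_dir: str, base_offset: int) -> Iterator[Tuple[str, int]]:
    """Yield (topic, partition) for each directory holding the wanted segment."""
    wanted = segment_file_name(base_offset)
    for root, _dirs, files in os.walk(log_dir):
        if wanted not in files:
            continue
        try:
            yield split_topic_partition(os.path.basename(root))
        except ValueError:
            # Not a plain '<topic>-<partition>' directory (for example one
            # that is pending deletion); skip it.
            continue


def reassignment_plan(partitions: List[Tuple[str, int]], replicas: List[int]) -> str:
    """Build reassignment JSON in the format kafka-reassign-partitions accepts.

    Assumption: every affected partition gets the same replica list, which is
    a simplification for illustration.
    """
    return json.dumps(
        {
            "version": 1,
            "partitions": [
                {"topic": topic, "partition": partition, "replicas": replicas}
                for topic, partition in partitions
            ],
        },
        indent=2,
    )
```

A possible usage would be `reassignment_plan(list(find_partitions("/path/to/log/dir", 5019648)), [2, 3, 4])`, where the broker ids are placeholders; the resulting JSON still has to be reviewed and applied with the reassignment tool before toggling log.cleaner.threads.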
cc: [~enether] [~hachikuji]

> LogCleaner thread dies with: currentLog cannot be empty on an unexpected exception
> ----------------------------------------------------------------------------------
>
> Key: KAFKA-9133
> URL: https://issues.apache.org/jira/browse/KAFKA-9133
> Project: Kafka
> Issue Type: Bug
> Components: log cleaner
> Affects Versions: 2.3.1
> Reporter: Karolis Pocius
> Priority: Major
>
> Log cleaner thread dies without a clear reference to which log is causing it:
> {code}
> [2019-11-02 11:59:59,078] INFO Starting the log cleaner (kafka.log.LogCleaner)
> [2019-11-02 11:59:59,144] INFO [kafka-log-cleaner-thread-0]: Starting (kafka.log.LogCleaner)
> [2019-11-02 11:59:59,199] ERROR [kafka-log-cleaner-thread-0]: Error due to (kafka.log.LogCleaner)
> java.lang.IllegalStateException: currentLog cannot be empty on an unexpected exception
> at kafka.log.LogCleaner$CleanerThread.cleanFilthiestLog(LogCleaner.scala:346)
> at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:307)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:89)
> Caused by: java.lang.IllegalArgumentException: Illegal request for non-active segments beginning at offset 5033130, which is larger than the active segment's base offset 5019648
> at kafka.log.Log.nonActiveLogSegmentsFrom(Log.scala:1933)
> at kafka.log.LogCleanerManager$.maxCompactionDelay(LogCleanerManager.scala:491)
> at kafka.log.LogCleanerManager.$anonfun$grabFilthiestCompactedLog$4(LogCleanerManager.scala:184)
> at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
> at scala.collection.immutable.List.foreach(List.scala:392)
> at scala.collection.TraversableLike.map(TraversableLike.scala:238)
> at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
> at scala.collection.immutable.List.map(List.scala:298)
> at kafka.log.LogCleanerManager.$anonfun$grabFilthiestCompactedLog$1(LogCleanerManager.scala:181)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
> at kafka.log.LogCleanerManager.grabFilthiestCompactedLog(LogCleanerManager.scala:171)
> at kafka.log.LogCleaner$CleanerThread.cleanFilthiestLog(LogCleaner.scala:321)
> ... 2 more
> [2019-11-02 11:59:59,200] INFO [kafka-log-cleaner-thread-0]: Stopped (kafka.log.LogCleaner)
> {code}
> If I try to resurrect it by dynamically bumping {{log.cleaner.threads}} it instantly dies with the exact same error.
> Not sure if this is something KAFKA-8725 is supposed to address.

--
This message was sent by Atlassian Jira (v8.3.4#803005)