We ran into this as well, and I ended up with the following settings, which work for us:
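As a sanity check on sizes like these: the "can fit only 5033164" figure in the error trace below lines up exactly with the default 128 MB dedupe buffer. A rough sketch of the arithmetic, assuming the offset map stores ~24 bytes per entry (a 16-byte MD5 hash plus an 8-byte offset, as in 0.9.x) and the default 0.9 load factor; the function name is mine, not a Kafka API:

```python
# Rough capacity estimate for the log cleaner's offset map (Kafka 0.9.x-era
# assumptions: 24 bytes per entry, 0.9 load factor, buffer shared by threads).
ENTRY_SIZE = 24    # 16-byte MD5 key + 8-byte offset
LOAD_FACTOR = 0.9  # default log.cleaner.io.buffer.load.factor

def offset_map_capacity(dedupe_buffer_bytes, cleaner_threads=1):
    """Approximate number of unique keys one cleaner thread can dedupe."""
    per_thread = dedupe_buffer_bytes // cleaner_threads
    return int(per_thread * LOAD_FACTOR) // ENTRY_SIZE

print(offset_map_capacity(128 * 1024 * 1024))  # default 128 MB -> 5033164
print(offset_map_capacity(536870912))          # 512 MB, as configured above
```

The default buffer reproduces the 5,033,164 in the error message, and bumping it to 512 MB raises the per-thread capacity to roughly 20 million keys, which is why the larger buffer helps here.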
log.cleaner.dedupe.buffer.size=536870912
log.cleaner.io.buffer.size=20000000

On 13/07/2016 14:01, "Lawrence Weikum" <lwei...@pandora.com> wrote:

>Apologies. Here is the full trace from a broker:
>
>[2016-06-24 09:57:39,881] ERROR [kafka-log-cleaner-thread-0], Error due to
>(kafka.log.LogCleaner)
>java.lang.IllegalArgumentException: requirement failed: 9730197928 messages in
>segment __consumer_offsets-36/00000000000000000000.log but offset map can fit
>only 5033164. You can increase log.cleaner.dedupe.buffer.size or decrease
>log.cleaner.threads
>        at scala.Predef$.require(Predef.scala:219)
>        at kafka.log.Cleaner$$anonfun$buildOffsetMap$4.apply(LogCleaner.scala:584)
>        at kafka.log.Cleaner$$anonfun$buildOffsetMap$4.apply(LogCleaner.scala:580)
>        at scala.collection.immutable.Stream$StreamWithFilter.foreach(Stream.scala:570)
>        at kafka.log.Cleaner.buildOffsetMap(LogCleaner.scala:580)
>        at kafka.log.Cleaner.clean(LogCleaner.scala:322)
>        at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:230)
>        at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:208)
>        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
>[2016-06-24 09:57:39,881] INFO [kafka-log-cleaner-thread-0], Stopped
>(kafka.log.LogCleaner)
>
>Is log.cleaner.dedupe.buffer.size a broker setting? What is a good number to
>set it to?
>
>Lawrence Weikum
>
>On 7/13/16, 11:18 AM, "Manikumar Reddy" <manikumar.re...@gmail.com> wrote:
>
>Can you post the complete error stack trace? Yes, you need to restart the
>affected brokers.
>You can tweak the log.cleaner.dedupe.buffer.size and log.cleaner.io.buffer.size
>configs.
>
>Some related JIRAs:
>
>https://issues.apache.org/jira/browse/KAFKA-3587
>https://issues.apache.org/jira/browse/KAFKA-3894
>https://issues.apache.org/jira/browse/KAFKA-3915
>
>On Wed, Jul 13, 2016 at 10:36 PM, Lawrence Weikum <lwei...@pandora.com>
>wrote:
>
>> Oh interesting. I didn’t know about that log file until now.
>>
>> The only error that has been populated among all brokers showing this
>> behavior is:
>>
>> ERROR [kafka-log-cleaner-thread-0], Error due to (kafka.log.LogCleaner)
>>
>> Then we see many messages like these:
>>
>> INFO Compaction for partition [__consumer_offsets,30] is resumed
>> (kafka.log.LogCleaner)
>> INFO The cleaning for partition [__consumer_offsets,30] is aborted
>> (kafka.log.LogCleaner)
>>
>> Using VisualVM, I do not see any log-cleaner threads in those brokers. I
>> do see them in the brokers not showing this behavior, though.
>>
>> Any idea why the LogCleaner failed?
>>
>> As a temporary fix, should we restart the affected brokers?
>>
>> Thanks again!
>>
>> Lawrence Weikum
>>
>> On 7/13/16, 10:34 AM, "Manikumar Reddy" <manikumar.re...@gmail.com> wrote:
>>
>> Hi,
>>
>> Are you seeing any errors in log-cleaner.log? The log-cleaner thread can
>> crash on certain errors.
>>
>> Thanks,
>> Manikumar
>>
>> On Wed, Jul 13, 2016 at 9:54 PM, Lawrence Weikum <lwei...@pandora.com>
>> wrote:
>>
>> > Hello,
>> >
>> > We’re seeing a strange behavior in Kafka 0.9.0.1 which occurs about every
>> > other week. I’m curious if others have seen it and know of a solution.
>> >
>> > Setup and scenario:
>> >
>> > - Brokers were initially set up with log compaction turned off.
>> > - After 30 days, log compaction was turned on.
>> > - At this time, the number of open FDs was ~30K per broker.
>> > - After 2 days, the __consumer_offsets topic was compacted fully.
>> >   Open FDs were reduced to ~5K per broker.
>> > - The cluster has been under normal load for roughly 7 days.
>> > - At the 7-day mark, the __consumer_offsets topic seems to have stopped
>> >   compacting on two of the brokers, and on those brokers the FD count
>> >   is up to ~25K.
>> >
>> > We have tried rebalancing the partitions before. The first time, the
>> > destination broker compacted the data fine and open FDs were low. The
>> > second time, the destination broker kept the FDs open.
>> >
>> > In all the broker logs, we’re seeing messages like this:
>> >
>> > INFO [Group Metadata Manager on Broker 8]: Removed 0 expired offsets in 0
>> > milliseconds. (kafka.coordinator.GroupMetadataManager)
>> >
>> > There are only 4 consumers at the moment on the cluster; one topic with
>> > 92 partitions.
>> >
>> > Is there a reason why log compaction may stop working, or why the
>> > __consumer_offsets topic would start holding thousands of FDs?
>> >
>> > Thank you all for your help!
>> >
>> > Lawrence Weikum
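The per-broker FD counts discussed in this thread (~30K, ~5K, ~25K) are easy to track on a Linux host by listing /proc/<pid>/fd; a minimal sketch, assuming procfs is available (the helper name is mine, and for a real broker you would pass the broker's PID rather than our own):

```python
import os

def count_fds(pid):
    """Count open file descriptors for a process by listing /proc/<pid>/fd (Linux)."""
    return len(os.listdir(f"/proc/{pid}/fd"))

# Demonstrate on the current process; for a Kafka broker you would first
# find its PID (e.g. with `pgrep -f kafka.Kafka`, pattern depends on your setup).
print(count_fds(os.getpid()))
```

Sampling this number periodically (via cron or a metrics agent) makes a ~5K vs ~25K divergence between brokers visible well before FD exhaustion becomes a problem.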