Try increasing log cleaner threads.

On Tue, Jul 19, 2016 at 1:40 AM, Lawrence Weikum <lwei...@pandora.com> wrote:
> It seems that the log-cleaner is still failing no matter what settings I give it.
>
> Here is the full output from one of our brokers:
>
> [2016-07-18 13:00:40,726] ERROR [kafka-log-cleaner-thread-0], Error due to (kafka.log.LogCleaner)
> java.lang.IllegalArgumentException: requirement failed: 192053210 messages in segment __consumer_offsets-15/00000000000000000000.log but offset map can fit only 74999999. You can increase log.cleaner.dedupe.buffer.size or decrease log.cleaner.threads
>         at scala.Predef$.require(Predef.scala:219)
>         at kafka.log.Cleaner$$anonfun$buildOffsetMap$4.apply(LogCleaner.scala:584)
>         at kafka.log.Cleaner$$anonfun$buildOffsetMap$4.apply(LogCleaner.scala:580)
>         at scala.collection.immutable.Stream$StreamWithFilter.foreach(Stream.scala:570)
>         at kafka.log.Cleaner.buildOffsetMap(LogCleaner.scala:580)
>         at kafka.log.Cleaner.clean(LogCleaner.scala:322)
>         at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:230)
>         at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:208)
>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> [2016-07-18 13:00:40,732] INFO [kafka-log-cleaner-thread-0], Stopped (kafka.log.LogCleaner)
>
> Currently, heap allocation is up to 64 GB, only one log-cleaning thread is set to run, and log.cleaner.dedupe.buffer.size is 2 GB. I get this error if I try to increase it any further:
>
> WARN [kafka-log-cleaner-thread-0], Cannot use more than 2G of cleaner buffer space per cleaner thread, ignoring excess buffer space... (kafka.log.LogCleaner)
>
> Is there something else I can do to help the broker compact the __consumer_offsets topic?
>
> Thank you again for your help!
>
> Lawrence Weikum
>
> On 7/13/16, 1:06 PM, "Rakesh Vidyadharan" <rvidyadha...@gracenote.com> wrote:
>
> We ran into this as well, and I ended up with the following settings, which work for us.
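The figures in that error follow from how the cleaner sizes its offset map. A rough sketch of the arithmetic, assuming the 0.9.x defaults of roughly 24 bytes per map entry (a 16-byte MD5 digest plus an 8-byte offset) and a 0.9 load factor (`log.cleaner.io.buffer.load.factor`):

```python
# Rough sketch: how many unique keys the cleaner's offset map can hold.
# Assumptions: ~24 bytes per entry (16-byte MD5 digest + 8-byte offset)
# and a 0.9 load factor, both based on Kafka 0.9.x cleaner defaults.
BYTES_PER_ENTRY = 24
LOAD_FACTOR = 0.9

def offset_map_capacity(dedupe_buffer_bytes, cleaner_threads=1):
    """Approximate entries the offset map can fit, per cleaner thread."""
    per_thread = dedupe_buffer_bytes // cleaner_threads
    return int(per_thread * LOAD_FACTOR / BYTES_PER_ENTRY)

# A 2 GB (2,000,000,000 byte) buffer with one cleaner thread gives ~75M
# entries -- in line with the "can fit only 74999999" in the error above,
# and far short of the 192,053,210 messages in the offending segment.
print(offset_map_capacity(2_000_000_000))  # 75000000
```

Under the same assumptions, the default 128 MB buffer (134217728 bytes) works out to 5,033,164 entries, which matches the figure in the older trace quoted further down in the thread. Note also that the buffer is split across cleaner threads, which is why the error message suggests decreasing `log.cleaner.threads`.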
>
> log.cleaner.dedupe.buffer.size=536870912
> log.cleaner.io.buffer.size=20000000
>
> On 13/07/2016 14:01, "Lawrence Weikum" <lwei...@pandora.com> wrote:
>
> >Apologies. Here is the full trace from a broker:
> >
> >[2016-06-24 09:57:39,881] ERROR [kafka-log-cleaner-thread-0], Error due to (kafka.log.LogCleaner)
> >java.lang.IllegalArgumentException: requirement failed: 9730197928 messages in segment __consumer_offsets-36/00000000000000000000.log but offset map can fit only 5033164. You can increase log.cleaner.dedupe.buffer.size or decrease log.cleaner.threads
> >        at scala.Predef$.require(Predef.scala:219)
> >        at kafka.log.Cleaner$$anonfun$buildOffsetMap$4.apply(LogCleaner.scala:584)
> >        at kafka.log.Cleaner$$anonfun$buildOffsetMap$4.apply(LogCleaner.scala:580)
> >        at scala.collection.immutable.Stream$StreamWithFilter.foreach(Stream.scala:570)
> >        at kafka.log.Cleaner.buildOffsetMap(LogCleaner.scala:580)
> >        at kafka.log.Cleaner.clean(LogCleaner.scala:322)
> >        at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:230)
> >        at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:208)
> >        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> >[2016-06-24 09:57:39,881] INFO [kafka-log-cleaner-thread-0], Stopped (kafka.log.LogCleaner)
> >
> >Is log.cleaner.dedupe.buffer.size a broker setting? What is a good number to set it to?
> >
> >Lawrence Weikum
> >
> >On 7/13/16, 11:18 AM, "Manikumar Reddy" <manikumar.re...@gmail.com> wrote:
> >
> >Can you post the complete error stack trace? Yes, you need to restart the affected brokers.
> >You can tweak the log.cleaner.dedupe.buffer.size and log.cleaner.io.buffer.size configs.
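To Lawrence's question: yes, these are broker-side settings. They go in each broker's server.properties and take effect on broker restart. A sketch of where they live (the values shown are Rakesh's from above, not recommendations; the comments are editorial):

```properties
# Broker-side log cleaner settings (server.properties); changes
# require a broker restart.
log.cleaner.enable=true
log.cleaner.threads=1
# Memory for the dedupe offset map, shared across all cleaner threads
# (capped at 2 GB per thread, per the WARN quoted above).
log.cleaner.dedupe.buffer.size=536870912
# I/O buffer used while reading and writing log segments during cleaning.
log.cleaner.io.buffer.size=20000000
```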
> >
> >Some related JIRAs:
> >
> >https://issues.apache.org/jira/browse/KAFKA-3587
> >https://issues.apache.org/jira/browse/KAFKA-3894
> >https://issues.apache.org/jira/browse/KAFKA-3915
> >
> >On Wed, Jul 13, 2016 at 10:36 PM, Lawrence Weikum <lwei...@pandora.com> wrote:
> >
> >> Oh interesting. I didn’t know about that log file until now.
> >>
> >> The only error that has been populated among all brokers showing this behavior is:
> >>
> >> ERROR [kafka-log-cleaner-thread-0], Error due to (kafka.log.LogCleaner)
> >>
> >> Then we see many messages like this:
> >>
> >> INFO Compaction for partition [__consumer_offsets,30] is resumed (kafka.log.LogCleaner)
> >> INFO The cleaning for partition [__consumer_offsets,30] is aborted (kafka.log.LogCleaner)
> >>
> >> Using VisualVM, I do not see any log-cleaner threads in those brokers. I do see them in the brokers not showing this behavior, though.
> >>
> >> Any idea why the LogCleaner failed?
> >>
> >> As a temporary fix, should we restart the affected brokers?
> >>
> >> Thanks again!
> >>
> >> Lawrence Weikum
> >>
> >> On 7/13/16, 10:34 AM, "Manikumar Reddy" <manikumar.re...@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> Are you seeing any errors in log-cleaner.log? The log-cleaner thread can crash on certain errors.
> >>
> >> Thanks
> >> Manikumar
> >>
> >> On Wed, Jul 13, 2016 at 9:54 PM, Lawrence Weikum <lwei...@pandora.com> wrote:
> >>
> >> > Hello,
> >> >
> >> > We’re seeing a strange behavior in Kafka 0.9.0.1 which occurs about every other week. I’m curious if others have seen it and know of a solution.
> >> >
> >> > Setup and scenario:
> >> >
> >> > - Brokers were initially set up with log compaction turned off.
> >> >
> >> > - After 30 days, log compaction was turned on.
> >> >
> >> > - At this time, the number of open FDs was ~30K per broker.
> >> >
> >> > - After 2 days, the __consumer_offsets topic was compacted fully. Open FDs reduced to ~5K per broker.
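Since the cleaner thread dies leaving little more than that one ERROR line, a quick way to check whether a broker's cleaner is dead is to scan log-cleaner.log for the fatal signature. A minimal sketch; the message pattern is inferred from the excerpts quoted in this thread, and the log file path varies by installation:

```python
import re

# Fatal signature: the cleaner thread logs "Error due to" and then stops.
# Pattern inferred from the stack traces quoted earlier in this thread.
CLEANER_DIED = re.compile(r"ERROR \[kafka-log-cleaner-thread-\d+\], Error due to")

def cleaner_thread_died(lines):
    """Return True if any log line matches the fatal cleaner-thread error."""
    return any(CLEANER_DIED.search(line) for line in lines)

# Example using lines quoted earlier in the thread; in practice you would
# read them from the broker's log-cleaner.log.
sample = [
    "[2016-06-24 09:57:39,881] ERROR [kafka-log-cleaner-thread-0], Error due to (kafka.log.LogCleaner)",
    "[2016-06-24 09:57:39,881] INFO [kafka-log-cleaner-thread-0], Stopped (kafka.log.LogCleaner)",
]
print(cleaner_thread_died(sample))  # True
```

The "Compaction ... is resumed" / "cleaning ... is aborted" INFO pairs mentioned above are normal housekeeping and deliberately not matched; only the ERROR line indicates the thread is gone and the broker needs a restart.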
> >> > - Cluster has been under normal load for roughly 7 days.
> >> >
> >> > - At the 7-day mark, the __consumer_offsets topic seems to have stopped compacting on two of the brokers, and on those brokers the FD count is up to ~25K.
> >> >
> >> > We have tried rebalancing the partitions before. The first time, the destination broker compacted the data fine and open FDs were low. The second time, the destination broker kept the FDs open.
> >> >
> >> > In all the broker logs, we’re seeing this message:
> >> >
> >> > INFO [Group Metadata Manager on Broker 8]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
> >> >
> >> > There are only 4 consumers at the moment on the cluster; one topic with 92 partitions.
> >> >
> >> > Is there a reason why log compaction may stop working, or why the __consumer_offsets topic would start holding thousands of FDs?
> >> >
> >> > Thank you all for your help!
> >> >
> >> > Lawrence Weikum
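The open-FD counts discussed above (~5K compacted vs. ~25-30K uncompacted) make a useful health signal: when compaction stalls, uncleaned segment files accumulate and the count climbs. On Linux they can be read straight from /proc. A small sketch; looking up the broker's PID is left out, and this is Linux-only:

```python
import os

def count_open_fds(pid):
    """Count open file descriptors for a process via /proc (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))

# Example: count this process's own FDs. For a broker you would pass the
# Kafka JVM's PID instead and alert when the count climbs back toward the
# uncompacted level.
print(count_open_fds(os.getpid()))
```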