Apologies. Here is the full trace from a broker:

[2016-06-24 09:57:39,881] ERROR [kafka-log-cleaner-thread-0], Error due to  (kafka.log.LogCleaner)
java.lang.IllegalArgumentException: requirement failed: 9730197928 messages in segment __consumer_offsets-36/00000000000000000000.log but offset map can fit only 5033164. You can increase log.cleaner.dedupe.buffer.size or decrease log.cleaner.threads
        at scala.Predef$.require(Predef.scala:219)
        at kafka.log.Cleaner$$anonfun$buildOffsetMap$4.apply(LogCleaner.scala:584)
        at kafka.log.Cleaner$$anonfun$buildOffsetMap$4.apply(LogCleaner.scala:580)
        at scala.collection.immutable.Stream$StreamWithFilter.foreach(Stream.scala:570)
        at kafka.log.Cleaner.buildOffsetMap(LogCleaner.scala:580)
        at kafka.log.Cleaner.clean(LogCleaner.scala:322)
        at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:230)
        at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:208)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
[2016-06-24 09:57:39,881] INFO [kafka-log-cleaner-thread-0], Stopped  (kafka.log.LogCleaner)


Is log.cleaner.dedupe.buffer.size a broker setting?  What is a good number to 
set it to?
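For what it's worth, the 5033164 figure in the trace is consistent with the broker defaults. A sketch of the arithmetic, assuming the default 128 MiB dedupe buffer, a single cleaner thread, the default 0.9 load factor, and 24 bytes per offset-map entry (16-byte MD5 key hash plus an 8-byte offset):

```python
# Rough capacity estimate for the log cleaner's offset map.
# Assumptions (defaults, not stated in the thread):
#   log.cleaner.dedupe.buffer.size = 134217728 (128 MiB)
#   log.cleaner.threads            = 1 (buffer is split across threads)
#   log.cleaner.io.buffer.load.factor = 0.9
#   24 bytes per entry = 16-byte MD5 hash + 8-byte offset
DEDUPE_BUFFER_BYTES = 134_217_728
LOAD_FACTOR = 0.9
ENTRY_BYTES = 24

capacity = int(DEDUPE_BUFFER_BYTES * LOAD_FACTOR / ENTRY_BYTES)
print(capacity)  # -> 5033164, matching the number in the error above
```

So to dedupe a segment with N unique keys in one pass, the buffer (per cleaner thread) would need to be roughly N * 24 / 0.9 bytes.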



Lawrence Weikum 


On 7/13/16, 11:18 AM, "Manikumar Reddy" <manikumar.re...@gmail.com> wrote:

Can you post the complete error stack trace?  Yes, you need to restart the
affected brokers.
You can tweak the log.cleaner.dedupe.buffer.size and
log.cleaner.io.buffer.size configs.

Some related JIRAs:

https://issues.apache.org/jira/browse/KAFKA-3587
https://issues.apache.org/jira/browse/KAFKA-3894
https://issues.apache.org/jira/browse/KAFKA-3915
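
As a sketch, the tuning above would go in each broker's server.properties (the values here are illustrative, not recommendations; size the dedupe buffer to the number of unique keys per partition, and note it is shared across cleaner threads):

```
# Illustrative values only -- raise the dedupe buffer so the offset map
# can hold more unique keys per cleaning pass.
log.cleaner.dedupe.buffer.size=536870912
# I/O buffer used while reading/writing log segments during cleaning.
log.cleaner.io.buffer.size=1048576
```

These are broker-level settings, so a rolling restart is needed for them to take effect.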

On Wed, Jul 13, 2016 at 10:36 PM, Lawrence Weikum <lwei...@pandora.com>
wrote:

> Oh interesting. I didn’t know about that log file until now.
>
> The only error that has been populated among all brokers showing this
> behavior is:
>
> ERROR [kafka-log-cleaner-thread-0], Error due to  (kafka.log.LogCleaner)
>
> Then we see many messages like this:
>
> INFO Compaction for partition [__consumer_offsets,30] is resumed
> (kafka.log.LogCleaner)
> INFO The cleaning for partition [__consumer_offsets,30] is aborted
> (kafka.log.LogCleaner)
>
> Using Visual VM, I do not see any log-cleaner threads in those brokers.  I
> do see it in the brokers not showing this behavior though.
>
> Any idea why the LogCleaner failed?
>
> As a temporary fix, should we restart the affected brokers?
>
> Thanks again!
>
>
> Lawrence Weikum
>
> On 7/13/16, 10:34 AM, "Manikumar Reddy" <manikumar.re...@gmail.com> wrote:
>
> Hi,
>
> Are you seeing any errors in log-cleaner.log?  The log-cleaner thread can
> crash on certain errors.
>
> Thanks
> Manikumar
>
> On Wed, Jul 13, 2016 at 9:54 PM, Lawrence Weikum <lwei...@pandora.com>
> wrote:
>
> > Hello,
> >
> > We’re seeing a strange behavior in Kafka 0.9.0.1 which occurs about every
> > other week.  I’m curious if others have seen it and know of a solution.
> >
> > Setup and Scenario:
> >
> > - Brokers were initially set up with log compaction turned off.
> >
> > - After 30 days, log compaction was turned on.
> >
> > - At that time, the number of open FDs was ~30K per broker.
> >
> > - After 2 days, the __consumer_offsets topic was fully compacted.
> > Open FDs dropped to ~5K per broker.
> >
> > - The cluster has been under normal load for roughly 7 days.
> >
> > - At the 7-day mark, the __consumer_offsets topic seems to have
> > stopped compacting on two of the brokers, and on those brokers the FD
> > count is back up to ~25K.
> >
> >
> > We have tried rebalancing the partitions before.  The first time, the
> > destination broker had compacted the data fine and open FDs were low. The
> > second time, the destination broker kept the FDs open.
> >
> >
> > In all the broker logs, we're seeing this message:
> > INFO [Group Metadata Manager on Broker 8]: Removed 0 expired offsets in 0
> > milliseconds. (kafka.coordinator.GroupMetadataManager)
> >
> > There are only 4 consumers at the moment on the cluster; one topic with
> > 92 partitions.
> >
> > Is there a reason why log compaction may stop working or why the
> > __consumer_offsets topic would start holding thousands of FDs?
> >
> > Thank you all for your help!
> >
> > Lawrence Weikum
> >
> >
>
>
>