We are running Kafka 0.9 and I am seeing very large __consumer_offsets partitions, some on the order of 100 GB or more, and some of the log and index files in those partition directories are more than a year old.
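For reference, this is roughly how I checked the segment sizes and ages (a quick sketch; the log dir path is from my setup):

import java.io.File
import java.time.Instant

// List the segment files in one __consumer_offsets partition
// directory, oldest first, with their sizes.
object SegmentAges extends App {
  val dir   = new File("/data/kafka-logs/__consumer_offsets-33")
  val files = Option(dir.listFiles()).getOrElse(Array.empty[File])
  files
    .filter(f => f.getName.endsWith(".log") || f.getName.endsWith(".index"))
    .sortBy(_.lastModified)
    .foreach { f =>
      println(s"${Instant.ofEpochMilli(f.lastModified)}  ${f.length / (1024 * 1024)} MB  ${f.getName}")
    }
}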
The following properties seem relevant:

offsets.retention.minutes=5769 (~4 days)
log.cleaner.dedupe.buffer.size=256000000 (256 MB)
num.recovery.threads.per.data.dir=4
log.cleaner.enable=true
log.cleaner.threads=1

Upon restarting the broker, I see the exception below, which clearly indicates a problem with the dedupe buffer size. However, the dedupe buffer size is already set to 256 MB, yet the error reports an offset map capacity of only 37,499,999 entries, which does not obviously line up with that setting. What could be the problem here? How can I get the offsets topic down to a manageable size?

2018-01-15 21:26:51,434 ERROR kafka.log.LogCleaner: [kafka-log-cleaner-thread-0], Error due to
java.lang.IllegalArgumentException: requirement failed: 990238234 messages in segment __consumer_offsets-33/00000000000000000000.log but offset map can fit only 37499999. You can increase log.cleaner.dedupe.buffer.size or decrease log.cleaner.threads
        at scala.Predef$.require(Predef.scala:219)
        at kafka.log.Cleaner$$anonfun$buildOffsetMap$4.apply(LogCleaner.scala:591)
        at kafka.log.Cleaner$$anonfun$buildOffsetMap$4.apply(LogCleaner.scala:587)
        at scala.collection.immutable.Stream$StreamWithFilter.foreach(Stream.scala:570)
        at kafka.log.Cleaner.buildOffsetMap(LogCleaner.scala:587)
        at kafka.log.Cleaner.clean(LogCleaner.scala:329)
        at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:237)
        at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:215)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
2018-01-15 21:26:51,436 INFO kafka.log.LogCleaner: [kafka-log-cleaner-thread-0], Stopped

Thanks,
-SK
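P.S. For what it's worth, here is my back-of-the-envelope model of where the 37499999 figure could come from (a sketch based on my reading of the 0.9 LogCleaner/SkimpyOffsetMap source; the 24-bytes-per-entry and 0.9 load-factor constants are assumptions from that reading, not settings on my broker):

// Rough model of the cleaner's offset map capacity.
// Assumed constants (from reading the 0.9 source, not from my config):
//   24 bytes per map entry = 16-byte MD5 hash + 8-byte offset
//   0.9 load factor        = log.cleaner.io.buffer.load.factor default
object OffsetMapCapacity extends App {
  val bytesPerEntry = 16 + 8
  val loadFactor    = 0.9

  // The dedupe buffer is split across cleaner threads; each thread's
  // share bounds how many unique keys its offset map can hold.
  def capacity(dedupeBufferBytes: Long, cleanerThreads: Int): Long =
    ((dedupeBufferBytes / cleanerThreads / bytesPerEntry) * loadFactor).toLong

  println(capacity(256000000L, 1))  // my configured 256 MB -> 9599999 entries
  println(capacity(1000000000L, 1)) // a 1 GB buffer -> 37499999, matching the error
}

If that model is right, my 256 MB buffer would hold only about 9.6 million entries, and even the reported capacity of ~37.5 million is far below the 990 million messages in that one segment, so I am not sure any buffer size would let the cleaner get through it.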