Mike Mintz created KAFKA-6491:
---------------------------------

             Summary: Large uncompacted __consumer_offsets files should not make broker restarts slow
                 Key: KAFKA-6491
                 URL: https://issues.apache.org/jira/browse/KAFKA-6491
             Project: Kafka
          Issue Type: Bug
          Components: offset manager
    Affects Versions: 0.10.2.1
            Reporter: Mike Mintz


Before discovering the {{log.cleaner.enable}} option, we had several Kafka 
brokers running 0.10.2.1 with ~500GB of __consumer_offsets files. After 
restarting the Kafka process on these brokers, clients could still produce 
and consume messages, but they failed to commit offsets to the 
__consumer_offsets partitions for which the restarted broker was the leader. 
jstack showed the broker spending its time in 
GroupMetadataManager.loadGroupsAndOffsets (example stack trace below).
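
For reference, a minimal sketch of the broker setting mentioned above. It assumes the log cleaner had been turned off in the brokers' server.properties; this report does not show the actual configuration, so the fragment is illustrative only.

{noformat}
# server.properties (illustrative fragment, not the reporter's actual file)
# Turns on the log cleaner so that topics with a compact cleanup policy,
# such as __consumer_offsets, are compacted.
log.cleaner.enable=true
{noformat}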

Ideally Kafka brokers would be able to start up much faster in the presence of 
large __consumer_offsets files.

{noformat}
"group-metadata-manager-0" #57 daemon prio=5 os_prio=0 tid=0x00007ffb54ec3800 nid=0xb9d runnable [0x00007ffa81139000]
   java.lang.Thread.State: RUNNABLE
  at sun.nio.ch.FileDispatcherImpl.pread0(Native Method)
  at sun.nio.ch.FileDispatcherImpl.pread(FileDispatcherImpl.java:52)
  at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:220)
  at sun.nio.ch.IOUtil.read(IOUtil.java:197)
  at sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:741)
  at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
  at org.apache.kafka.common.utils.Utils.readFully(Utils.java:854)
  at org.apache.kafka.common.record.FileRecords.readInto(FileRecords.java:121)
  at kafka.coordinator.GroupMetadataManager.loadGroupsAndOffsets(GroupMetadataManager.scala:427)
  at kafka.coordinator.GroupMetadataManager.kafka$coordinator$GroupMetadataManager$$doLoadGroupsAndOffsets$1(GroupMetadataManager.scala:392)
  at kafka.coordinator.GroupMetadataManager$$anonfun$loadGroupsForPartition$1.apply$mcV$sp(GroupMetadataManager.scala:403)
  at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
  at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:57)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
