Just wanted to close the loop on this. It seems the consumer offset logs might 
have been corrupted by the system restart. Deleting the topic logs and 
restarting the Kafka service cleared up the problem.
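For anyone who hits this later, the recovery amounted to something like the sketch below. The service name, log directory, and backup path are illustrative assumptions about a typical install, not our exact setup, so adjust before running anything:

```shell
# Sketch of the recovery steps, not a drop-in script.
# LOG_DIR should match the broker's log.dirs setting (assumed path here).
LOG_DIR=${LOG_DIR:-/var/lib/kafka/logs}
BACKUP_DIR=${BACKUP_DIR:-/tmp/offsets-backup}

# 1. Stop the broker (service name varies by install).
sudo systemctl stop kafka

# 2. Move the __consumer_offsets partition directories aside rather
#    than deleting them outright, in case they're needed later.
mkdir -p "$BACKUP_DIR"
for d in "$LOG_DIR"/__consumer_offsets-*; do
  mv "$d" "$BACKUP_DIR"/
done

# 3. Restart; the broker recreates the offsets topic on demand.
sudo systemctl start kafka
```

Note this discards committed offsets, so consumers fall back to their auto.offset.reset policy afterwards.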

Thanks,
Dave



On 1/12/17, 2:29 PM, "Dave Hamilton" <dhamil...@nanigans.com> wrote:

    Hello, we ran into a memory issue on a Kafka 0.10.0.1 broker that required a system restart. Since bringing Kafka back up, the consumers seem to be having trouble finding their group coordinators. Here are some errors we've seen in our server logs after restarting:
    
    [2017-01-12 19:02:10,178] ERROR [Group Metadata Manager on Broker 0]: Error in loading offsets from [__consumer_offsets,40] (kafka.coordinator.GroupMetadataManager)
    java.nio.channels.ClosedChannelException
        at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:99)
        at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:678)
        at kafka.log.FileMessageSet.searchFor(FileMessageSet.scala:135)
        at kafka.log.LogSegment.translateOffset(LogSegment.scala:106)
        at kafka.log.LogSegment.read(LogSegment.scala:127)
        at kafka.log.Log.read(Log.scala:532)
        at kafka.coordinator.GroupMetadataManager$$anonfun$kafka$coordinator$GroupMetadataManager$$loadGroupsAndOffsets$1$1.apply$mcV$sp(GroupMetadataManager.scala:380)
        at kafka.coordinator.GroupMetadataManager$$anonfun$kafka$coordinator$GroupMetadataManager$$loadGroupsAndOffsets$1$1.apply(GroupMetadataManager.scala:374)
        at kafka.coordinator.GroupMetadataManager$$anonfun$kafka$coordinator$GroupMetadataManager$$loadGroupsAndOffsets$1$1.apply(GroupMetadataManager.scala:374)
        at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:231)
        at kafka.utils.CoreUtils$.inWriteLock(CoreUtils.scala:239)
        at kafka.coordinator.GroupMetadataManager.kafka$coordinator$GroupMetadataManager$$loadGroupsAndOffsets$1(GroupMetadataManager.scala:374)
        at kafka.coordinator.GroupMetadataManager$$anonfun$loadGroupsForPartition$1.apply$mcV$sp(GroupMetadataManager.scala:353)
        at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
        at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:56)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
    [2017-01-12 19:03:56,468] ERROR [KafkaApi-0] Error when handling request {topics=[__consumer_offsets]} (kafka.server.KafkaApis)
    kafka.admin.AdminOperationException: replication factor: 1 larger than available brokers: 0
        at kafka.admin.AdminUtils$.assignReplicasToBrokers(AdminUtils.scala:117)
        at kafka.admin.AdminUtils$.createTopic(AdminUtils.scala:403)
        at kafka.server.KafkaApis.kafka$server$KafkaApis$$createTopic(KafkaApis.scala:629)
        at kafka.server.KafkaApis.kafka$server$KafkaApis$$createGroupMetadataTopic(KafkaApis.scala:651)
        at kafka.server.KafkaApis$$anonfun$29.apply(KafkaApis.scala:668)
        at kafka.server.KafkaApis$$anonfun$29.apply(KafkaApis.scala:666)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.immutable.Set$Set1.foreach(Set.scala:94)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
        at scala.collection.AbstractSet.scala$collection$SetLike$$super$map(Set.scala:47)
        at scala.collection.SetLike$class.map(SetLike.scala:92)
        at scala.collection.AbstractSet.map(Set.scala:47)
        at kafka.server.KafkaApis.getTopicMetadata(KafkaApis.scala:666)
        at kafka.server.KafkaApis.handleTopicMetadataRequest(KafkaApis.scala:727)
        at kafka.server.KafkaApis.handle(KafkaApis.scala:79)
        at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
        at java.lang.Thread.run(Thread.java:744)
    
    Running kafka-consumer-groups.sh against one of our consumer groups also returns the following:
    
    Error while executing consumer group command This is not the correct 
coordinator for this group.
    org.apache.kafka.common.errors.NotCoordinatorForGroupException: This is not 
the correct coordinator for this group.
    
    We also see the following logs when trying to restart a Kafka connector:
    
    [2017-01-12 17:44:07,941] INFO Discovered coordinator lxskfkdal501.nanigans.com:9092 (id: 2147483647 rack: null) for group connect-paid_events_s3. (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:505)
    [2017-01-12 17:44:07,941] INFO (Re-)joining group connect-paid_events_s3 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:326)
    [2017-01-12 17:44:07,941] INFO Marking the coordinator lxskfkdal501.nanigans.com:9092 (id: 2147483647 rack: null) dead for group connect-paid_events_s3 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:542)
    
    Does anyone have recommendations for what we can do to recover from this 
issue?
    
    Thanks,
    Dave
    
    
