[ 
https://issues.apache.org/jira/browse/KAFKA-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440212#comment-16440212
 ] 

Jeff Widman edited comment on KAFKA-6266 at 4/17/18 1:07 AM:
-------------------------------------------------------------

I am hitting this after upgrading a 0.10.0.1 broker to 1.0.1. 

The __consumer_offsets topic has the following config:
{code}
Topic:__consumer_offsets        PartitionCount:157      ReplicationFactor:2     
Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
{code}

Ignore the non-standard partition count; it is a byproduct of a bug from 
years ago. In this case, I think its only effect is to make it less likely 
that any given partition of the __consumer_offsets topic gets produced to, 
and producing to the partition sounds like it would clear this error.

As described above, the external symptom was a zero-byte log file with a 
name like 00000000000000012526.log. 
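
To look for this symptom across a broker's data directory, something like 
the following Python sketch may help. It assumes the usual on-disk layout of 
one directory per partition (e.g. __consumer_offsets-19) containing segment 
files named by a 20-digit base offset; the function name and the log_dir 
argument are made up for illustration. A hit is only a candidate: an empty 
segment with a non-zero base offset is not necessarily broken.

```python
import os
import re

# Segment files are named <20-digit base offset>.log
SEGMENT_RE = re.compile(r"^(\d{20})\.log$")


def find_suspect_segments(log_dir, topic="__consumer_offsets"):
    """Return sorted (path, base_offset) pairs for zero-byte segments
    whose filename carries a non-zero base offset."""
    suspects = []
    for entry in os.listdir(log_dir):
        part_dir = os.path.join(log_dir, entry)
        # Only inspect partition directories of the given topic.
        if not entry.startswith(topic + "-") or not os.path.isdir(part_dir):
            continue
        for name in os.listdir(part_dir):
            match = SEGMENT_RE.match(name)
            if not match:
                continue
            path = os.path.join(part_dir, name)
            base_offset = int(match.group(1))
            if base_offset != 0 and os.path.getsize(path) == 0:
                suspects.append((path, base_offset))
    return sorted(suspects)
```

Calling find_suspect_segments() with the broker's configured log directory 
returns the matching paths together with the base offset parsed from each 
filename.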

Since this particular cluster has significantly more partitions in 
__consumer_offsets than it does consumer groups, the error will not clear 
anytime soon, because no consumer group offsets are being hashed onto the 
problem partitions.
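
To see which partitions receive no offset commits, the group-to-partition 
mapping can be reproduced offline. The sketch below is in Python and assumes 
the broker picks the partition as abs(groupId.hashCode) modulo the partition 
count of __consumer_offsets, with Integer.MIN_VALUE mapped to 0; the function 
names and the sample group ids are hypothetical.

```python
def java_string_hashcode(s):
    """Emulate Java's String.hashCode() (32-bit signed, over UTF-16 units)."""
    data = s.encode("utf-16-be")
    h = 0
    for i in range(0, len(data), 2):
        code_unit = (data[i] << 8) | data[i + 1]
        h = (31 * h + code_unit) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as a signed int.
    return h - 0x100000000 if h >= 0x80000000 else h


def offsets_partition_for(group_id, num_partitions):
    """Assumed mapping: abs(hashCode) % num_partitions, MIN_VALUE -> 0."""
    h = java_string_hashcode(group_id)
    h = 0 if h == -0x80000000 else abs(h)
    return h % num_partitions


def unused_partitions(group_ids, num_partitions):
    """Partitions that none of the given consumer groups hash onto."""
    used = {offsets_partition_for(g, num_partitions) for g in group_ids}
    return sorted(set(range(num_partitions)) - used)


if __name__ == "__main__":
    groups = ["billing-consumer", "audit-consumer"]  # hypothetical group ids
    print(unused_partitions(groups, 157))
```

If a problem partition shows up in the unused list, no existing group will 
write to it, which matches the behaviour described above.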

So to get out of the situation, I shut down all brokers that have replicas of 
the partition, deleted the log files for that partition, and then restarted 
the brokers. This cleared the filename so that it matched the zero-byte 
contents.
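
The deletion step could be scripted roughly as below. This is only a sketch 
in Python: it assumes that "log files" means the segment files and their 
indexes (.log, .index, .timeindex), which the steps above do not spell out, 
and it does not check that the brokers are actually stopped. The function 
name is made up, and it defaults to a dry run that only lists what would be 
removed.

```python
import os

# Assumption: "log files" = segment files plus their offset/time indexes.
SEGMENT_SUFFIXES = (".log", ".index", ".timeindex")


def clear_partition_dir(partition_dir, dry_run=True):
    """Return the segment files in one partition directory and, when
    dry_run is False, delete them. Run only while every broker hosting
    a replica of the partition is shut down."""
    targets = sorted(
        os.path.join(partition_dir, name)
        for name in os.listdir(partition_dir)
        if name.endswith(SEGMENT_SUFFIXES)
    )
    if not dry_run:
        for path in targets:
            os.remove(path)
    return targets
```

Running it once with dry_run=True and reviewing the returned list before 
running it again with dry_run=False keeps the destructive step explicit.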

Note that doing this in production requires downtime, because you are taking 
a partition of the __consumer_offsets topic completely offline. On the flip 
side, you are only likely to hit this on somewhat underloaded clusters that 
can probably afford downtime. Busy production clusters will typically clear 
the error on their own as consumer groups produce to this partition.


> Kafka 1.0.0 : Repeated occurrence of WARN Resetting first dirty offset of 
> __consumer_offsets-xx to log start offset 203569 since the checkpointed 
> offset 120955 is invalid. (kafka.log.LogCleanerManager$)
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-6266
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6266
>             Project: Kafka
>          Issue Type: Bug
>          Components: offset manager
>    Affects Versions: 1.0.0
>         Environment: CentOS 7, Apache kafka_2.12-1.0.0
>            Reporter: VinayKumar
>            Priority: Major
>
> I upgraded Kafka from 0.10.2.1 to 1.0.0. Since then, I have been seeing the 
> warnings below continuously in the log, and I want to fix them so that they 
> won't repeat. Can someone please help me fix these warnings?
> {code}
> WARN Resetting first dirty offset of __consumer_offsets-17 to log start 
> offset 3346 since the checkpointed offset 3332 is invalid. 
> (kafka.log.LogCleanerManager$)
>  WARN Resetting first dirty offset of __consumer_offsets-23 to log start 
> offset 4 since the checkpointed offset 1 is invalid. 
> (kafka.log.LogCleanerManager$)
>  WARN Resetting first dirty offset of __consumer_offsets-19 to log start 
> offset 203569 since the checkpointed offset 120955 is invalid. 
> (kafka.log.LogCleanerManager$)
>  WARN Resetting first dirty offset of __consumer_offsets-35 to log start 
> offset 16957 since the checkpointed offset 7 is invalid. 
> (kafka.log.LogCleanerManager$)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
