[ https://issues.apache.org/jira/browse/KAFKA-13077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454611#comment-17454611 ]
Shivakumar Kedlaya edited comment on KAFKA-13077 at 12/7/21, 12:05 PM:
-----------------------------------------------------------------------
[~causton] We are seeing a similar issue in our cluster. Have you found any temporary workaround so far? [~junrao] Could you please help us here?

> Replication failing after unclean shutdown of ZK and all brokers
> ----------------------------------------------------------------
>
> Key: KAFKA-13077
> URL: https://issues.apache.org/jira/browse/KAFKA-13077
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 2.8.0
> Reporter: Christopher Auston
> Priority: Minor
>
> I am submitting this in the spirit of documenting what can go wrong when an operator violates the constraints Kafka depends on. I don't know whether Kafka could or should handle this more gracefully. I decided to file this issue because it was easy to hit the problem I'm reporting with Kubernetes StatefulSets (STS). By "easy" I mean that I did not go out of my way to corrupt anything; I simply was not careful when restarting ZK and the brokers.
> I violated the constraints of keeping ZooKeeper stable and keeping at least one in-sync replica running.
> I am running the bitnami/kafka Helm chart on Amazon EKS.
> {quote}% kubectl get po kaf-kafka-0 -ojson | jq '.spec.containers[].image'
> "docker.io/bitnami/kafka:2.8.0-debian-10-r43"
> {quote}
> I started with 3 ZK instances and 3 brokers (both STS). I changed the cpu/memory requests on both STS, and Kubernetes proceeded to restart the ZK and Kafka instances at the same time. If I recall correctly there were some crashes and several restarts, but eventually all the instances were running again. It's possible that all the ZK nodes and all the brokers were unavailable at various points.
> The problem I noticed was that two of the brokers were just continually spitting out messages like:
> {quote}% kubectl logs kaf-kafka-0 --tail 10
> [2021-07-13 14:26:08,871] INFO [ProducerStateManager partition=__transaction_state-0] Loading producer state from snapshot file 'SnapshotFile(/bitnami/kafka/data/__transaction_state-0/00000000000000000001.snapshot,1)' (kafka.log.ProducerStateManager)
> [2021-07-13 14:26:08,871] WARN [Log partition=__transaction_state-0, dir=/bitnami/kafka/data] *Non-monotonic update of high watermark from (offset=2744 segment=[0:1048644]) to (offset=1 segment=[0:169])* (kafka.log.Log)
> [2021-07-13 14:26:08,874] INFO [Log partition=__transaction_state-10, dir=/bitnami/kafka/data] Truncating to offset 2 (kafka.log.Log)
> [2021-07-13 14:26:08,877] INFO [Log partition=__transaction_state-10, dir=/bitnami/kafka/data] Loading producer state till offset 2 with message format version 2 (kafka.log.Log)
> [2021-07-13 14:26:08,877] INFO [ProducerStateManager partition=__transaction_state-10] Loading producer state from snapshot file 'SnapshotFile(/bitnami/kafka/data/__transaction_state-10/00000000000000000002.snapshot,2)' (kafka.log.ProducerStateManager)
> [2021-07-13 14:26:08,877] WARN [Log partition=__transaction_state-10, dir=/bitnami/kafka/data] Non-monotonic update of high watermark from (offset=2930 segment=[0:1048717]) to (offset=2 segment=[0:338]) (kafka.log.Log)
> [2021-07-13 14:26:08,880] INFO [Log partition=__transaction_state-20, dir=/bitnami/kafka/data] Truncating to offset 1 (kafka.log.Log)
> [2021-07-13 14:26:08,882] INFO [Log partition=__transaction_state-20, dir=/bitnami/kafka/data] Loading producer state till offset 1 with message format version 2 (kafka.log.Log)
> [2021-07-13 14:26:08,882] INFO [ProducerStateManager partition=__transaction_state-20] Loading producer state from snapshot file 'SnapshotFile(/bitnami/kafka/data/__transaction_state-20/00000000000000000001.snapshot,1)' (kafka.log.ProducerStateManager)
> [2021-07-13 14:26:08,883] WARN [Log partition=__transaction_state-20, dir=/bitnami/kafka/data] Non-monotonic update of high watermark from (offset=2956 segment=[0:1048608]) to (offset=1 segment=[0:169]) (kafka.log.Log)
> {quote}
> If I describe that topic I can see that several partitions have a leader of 2 and an ISR of just 2 (NOTE: I added two more brokers and tried to reassign the topic onto brokers 2,3,4, as you can see below). The new brokers also spit out the "non-monotonic update" messages just like the original followers. This describe output is from the following day.
> {{% kafka-topics.sh ${=BS} -topic __transaction_state -describe}}
> {{Topic: __transaction_state TopicId: i7bBNCeuQMWl-ZMpzrnMAw PartitionCount: 50 ReplicationFactor: 3 Configs: compression.type=uncompressed,min.insync.replicas=3,cleanup.policy=compact,flush.ms=1000,segment.bytes=104857600,flush.messages=10000,max.message.bytes=1000012,unclean.leader.election.enable=false,retention.bytes=1073741824}}
> {{ Topic: __transaction_state Partition: 0 Leader: 2 Replicas: 4,3,2,1,0 Isr: 2 Adding Replicas: 4,3 Removing Replicas: 1,0}}
> {{ Topic: __transaction_state Partition: 1 Leader: 2 Replicas: 2,4,3 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 2 Leader: 3 Replicas: 3,2,4 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 3 Leader: 4 Replicas: 4,2,3 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 4 Leader: 2 Replicas: 2,3,4 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 5 Leader: 2 Replicas: 3,4,2 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 6 Leader: 4 Replicas: 4,3,2 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 7 Leader: 2 Replicas: 2,4,3 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 8 Leader: 2 Replicas: 3,2,4,0,1 Isr: 2 Adding Replicas: 3,4 Removing Replicas: 0,1}}
> {{ Topic: __transaction_state Partition: 9 Leader: 2 Replicas: 4,2,3,1,0 Isr: 2 Adding Replicas: 4,3 Removing Replicas: 1,0}}
> {{ Topic: __transaction_state Partition: 10 Leader: 2 Replicas: 2,3,4,1,0 Isr: 2 Adding Replicas: 3,4 Removing Replicas: 1,0}}
> {{ Topic: __transaction_state Partition: 11 Leader: 3 Replicas: 3,4,2 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 12 Leader: 4 Replicas: 4,3,2 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 13 Leader: 2 Replicas: 2,4,3 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 14 Leader: 3 Replicas: 3,2,4 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 15 Leader: 4 Replicas: 4,2,3 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 16 Leader: 2 Replicas: 2,3,4 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 17 Leader: 2 Replicas: 3,4,2 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 18 Leader: 4 Replicas: 4,3,2 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 19 Leader: 2 Replicas: 2,4,3,0,1 Isr: 2 Adding Replicas: 4,3 Removing Replicas: 0,1}}
> {{ Topic: __transaction_state Partition: 20 Leader: 2 Replicas: 3,2,4,0,1 Isr: 2 Adding Replicas: 3,4 Removing Replicas: 0,1}}
> {{ Topic: __transaction_state Partition: 21 Leader: 2 Replicas: 4,2,3,1,0 Isr: 2 Adding Replicas: 4,3 Removing Replicas: 1,0}}
> {{ Topic: __transaction_state Partition: 22 Leader: 2 Replicas: 2,3,4,1,0 Isr: 2 Adding Replicas: 3,4 Removing Replicas: 1,0}}
> {{ Topic: __transaction_state Partition: 23 Leader: 3 Replicas: 3,4,2 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 24 Leader: 4 Replicas: 4,3,2 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 25 Leader: 2 Replicas: 2,4,3 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 26 Leader: 3 Replicas: 3,2,4 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 27 Leader: 4 Replicas: 4,2,3 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 28 Leader: 2 Replicas: 2,3,4 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 29 Leader: 3 Replicas: 3,4,2 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 30 Leader: 4 Replicas: 4,3,2 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 31 Leader: 2 Replicas: 2,4,3 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 32 Leader: 3 Replicas: 3,2,4 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 33 Leader: 4 Replicas: 4,2,3 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 34 Leader: 2 Replicas: 2,3,4 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 35 Leader: 3 Replicas: 3,4,2 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 36 Leader: 4 Replicas: 4,3,2 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 37 Leader: 2 Replicas: 2,4,3 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 38 Leader: 3 Replicas: 3,2,4 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 39 Leader: 4 Replicas: 4,2,3 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 40 Leader: 2 Replicas: 2,3,4 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 41 Leader: 3 Replicas: 3,4,2 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 42 Leader: 4 Replicas: 4,3,2 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 43 Leader: 2 Replicas: 2,4,3 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 44 Leader: 3 Replicas: 3,2,4 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 45 Leader: 4 Replicas: 4,2,3 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 46 Leader: 2 Replicas: 2,3,4 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 47 Leader: 3 Replicas: 3,4,2 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 48 Leader: 4 Replicas: 4,3,2 Isr: 2,3,4}}
> {{ Topic: __transaction_state Partition: 49 Leader: 2 Replicas: 2,4,3 Isr: 2,3,4}}
>
> It seems something got corrupted and the followers will never make progress.
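[Editor's note] A throwaway way to pull the stuck partitions out of describe output like the above. This is a hypothetical helper, not part of the ticket; the two sample lines are copied from the listing, and the threshold of 3 comes from the topic's min.insync.replicas=3 config.

```shell
# Hypothetical helper: scan `kafka-topics.sh --describe` output and flag
# partitions whose ISR has shrunk below min.insync.replicas (3 for this topic).
# The two sample lines mirror partitions 0 and 1 from the describe output above.
describe_output='Topic: __transaction_state Partition: 0 Leader: 2 Replicas: 4,3,2,1,0 Isr: 2 Adding Replicas: 4,3 Removing Replicas: 1,0
Topic: __transaction_state Partition: 1 Leader: 2 Replicas: 2,4,3 Isr: 2,3,4'

under=$(echo "$describe_output" | awk -v min_isr=3 '
/Partition:/ {
  # find the token after "Isr:" and count its comma-separated broker ids
  for (i = 1; i <= NF; i++) if ($i == "Isr:") isr = $(i + 1)
  if (split(isr, a, ",") < min_isr) print "under min ISR ->", $0
}')
echo "$under"
```

Recent kafka-topics.sh releases also accept an `--under-min-isr-partitions` flag that does this filtering server-side (availability depends on the Kafka version), which would avoid parsing the text output at all.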
> Even worse, the original followers appear to have truncated their copies, so if the remaining leader replica is the one that is corrupted, it may have caused followers that held more valid data to truncate it away.
> Anyway, for what it's worth, this is something that happened to me. I plan to change the StatefulSets to require manual restarts so that I can control rolling upgrades. It also seems to underscore the value of keeping a separate Kafka cluster for disaster recovery.

-- This message was sent by Atlassian Jira (v8.20.1#820001)
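[Editor's note] The "require manual restarts" plan above maps to the StatefulSet's updateStrategy field. A sketch of the change, assuming the chart lets you set this on the broker (and ZK) StatefulSets; the field names are standard Kubernetes, everything else here is an assumption:

```yaml
# Sketch: with OnDelete, editing the StatefulSet spec (e.g. cpu/memory
# requests) no longer triggers an automatic rolling restart; pods are
# replaced only when an operator deletes them explicitly.
spec:
  updateStrategy:
    type: OnDelete   # the default is RollingUpdate
```

With this in place, an operator can roll the brokers one at a time by deleting pods by hand and waiting for each replica to rejoin the ISR before moving on, which avoids the simultaneous ZK-and-broker restarts that triggered this incident.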