[ https://issues.apache.org/jira/browse/KAFKA-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Chen resolved KAFKA-16132.
-------------------------------
    Resolution: Duplicate

> Upgrading from 3.6 to 3.7 in KRaft will have seconds of partitions unavailable
> ------------------------------------------------------------------------------
>
>                 Key: KAFKA-16132
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16132
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 3.7.0
>            Reporter: Luke Chen
>            Priority: Blocker
>
> When upgrading from 3.6 to 3.7, we noticed that after upgrading the metadata
> version, all partitions are reset at once, which causes a short period of
> unavailability. This did not happen before.
> {code:java}
> [2024-01-15 20:45:19,757] INFO [BrokerMetadataPublisher id=2] Updating
> metadata.version to 19 at offset OffsetAndEpoch(offset=229, epoch=2).
> (kafka.server.metadata.BrokerMetadataPublisher)
> [2024-01-15 20:45:29,915] INFO [ReplicaFetcherManager on broker 2] Removed
> fetcher for partitions Set(t1-29, t1-25, t1-21, t1-17, t1-46, t1-13, t1-42,
> t1-9, t1-38, t1-5, t1-34, t1-1, t1-30, t1-26, t1-22, t1-18, t1-47, t1-14,
> t1-43, t1-10, t1-39, t1-6, t1-35, t1-2, t1-31, t1-27, t1-23, t1-19, t1-48,
> t1-15, t1-44, t1-11, t1-40, t1-7, t1-36, t1-3, t1-32, t1-28, t1-24, t1-20,
> t1-49, t1-16, t1-45, t1-12, t1-41, t1-8, t1-37, t1-4, t1-33, t1-0)
> (kafka.server.ReplicaFetcherManager)
> {code}
> Complete log:
> https://gist.github.com/showuon/665aa3ce6afd59097a2662f8260ecc10
>
> Steps:
> 1. Start up a 3.6 Kafka cluster in KRaft mode with 1 broker
> 2. Create a topic
> 3. Upgrade the binary to 3.7
> 4. Use kafka-features.sh to upgrade to the 3.7 metadata version
> 5. Check the log (and metrics if interested)
>
> Analysis:
> In 3.7, KRaft has JBOD support, so the partitionRegistration gained a new
> directory field. This causes a diff to be found when comparing the delta.
> We might be able to detect that this added-directory change does not need
> to reset the leader/follower state and only needs a metadata update, to
> avoid causing unavailability.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
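
The idea in the Analysis paragraph can be sketched as follows. This is only an illustration under stated assumptions: `PartitionState`, `metadataChanged`, and `needsLeaderOrFollowerReset` are hypothetical names made up for this sketch, not Kafka's actual classes or the fix that was eventually applied. It shows a comparison that treats a change limited to the new directory field as a metadata-only update.

```java
import java.util.List;

public class DirectoryOnlyChange {

    // Hypothetical, simplified stand-in for a partition registration.
    // The real Kafka class has more fields and different types.
    public record PartitionState(int leader,
                                 int leaderEpoch,
                                 List<Integer> replicas,
                                 List<Integer> isr,
                                 List<String> directories) {}

    // True if anything at all differs, including the directory field.
    public static boolean metadataChanged(PartitionState prev, PartitionState next) {
        return !prev.equals(next);
    }

    // True only if a field other than the directories differs. A change that
    // touches just the directories is assumed (for this sketch) not to require
    // restarting fetchers or re-electing leader/follower roles.
    public static boolean needsLeaderOrFollowerReset(PartitionState prev, PartitionState next) {
        return prev.leader() != next.leader()
            || prev.leaderEpoch() != next.leaderEpoch()
            || !prev.replicas().equals(next.replicas())
            || !prev.isr().equals(next.isr());
    }

    public static void main(String[] args) {
        // Before the metadata version upgrade: no directory assignment recorded.
        PartitionState before = new PartitionState(2, 5, List.of(2), List.of(2), List.of());
        // After the upgrade: same leader, epoch, and replicas; only the directory is filled in.
        PartitionState after = new PartitionState(2, 5, List.of(2), List.of(2), List.of("dir-a"));

        System.out.println("metadata changed: " + metadataChanged(before, after));
        System.out.println("needs reset: " + needsLeaderOrFollowerReset(before, after));
    }
}
```

With a check along these lines, the delta produced by the metadata version upgrade would still be recorded, but the broker would skip removing and re-adding fetchers for partitions whose only difference is the directory field.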