[jira] [Updated] (KAFKA-16282) Allow to get last stable offset (LSO) in kafka-get-offsets.sh
[ https://issues.apache.org/jira/browse/KAFKA-16282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Chen updated KAFKA-16282:
------------------------------
    Description: 
Currently, when using `kafka-get-offsets.sh` to get the offset by time, we have these choices:
{code:java}
--time                  timestamp of the offsets before that.
  -1 or latest          [Note: No offset is returned, if a
  -2 or earliest        timestamp greater than the most recently
  -3 or max-timestamp   committed record timestamp is given.]
  -4 or earliest-local  (default: latest)
  -5 or latest-tiered
{code}
For the "latest" option, the command always returns the "high watermark" because the request is sent with the default *IsolationLevel.READ_UNCOMMITTED*. It would be good if the command could also return the last stable offset (LSO) for transactional use cases, i.e. send the request with *IsolationLevel.READ_COMMITTED*.

> Allow to get last stable offset (LSO) in kafka-get-offsets.sh
> -------------------------------------------------------------
>
>                 Key: KAFKA-16282
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16282
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Luke Chen
>            Assignee: Ahmed Sobeh
>            Priority: Major
>              Labels: need-kip, newbie, newbie++
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Commented] (KAFKA-16282) Allow to get last stable offset (LSO) in kafka-get-offsets.sh
[ https://issues.apache.org/jira/browse/KAFKA-16282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818759#comment-17818759 ]

Luke Chen commented on KAFKA-16282:
-----------------------------------
Great! Thanks [~ahmedsobeh]! Let me know if you need any help.

> Allow to get last stable offset (LSO) in kafka-get-offsets.sh
>
>                 Key: KAFKA-16282
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16282
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Luke Chen
>            Assignee: Ahmed Sobeh
>            Priority: Major
>              Labels: need-kip, newbie, newbie++
[jira] [Updated] (KAFKA-16282) Allow to get last stable offset (LSO) in kafka-get-offsets.sh
[ https://issues.apache.org/jira/browse/KAFKA-16282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Chen updated KAFKA-16282:
------------------------------
    Description: 
Currently, when using `kafka-get-offsets.sh` to get the offset by time, we have these choices:
{code:java}
--time                  timestamp of the offsets before that.
  -1 or latest          [Note: No offset is returned, if a
  -2 or earliest        timestamp greater than the most recently
  -3 or max-timestamp   committed record timestamp is given.]
  -4 or earliest-local  (default: latest)
  -5 or latest-tiered
{code}
For the "latest" option, the command always returns the "high watermark" because the request is sent with the default *IsolationLevel.READ_UNCOMMITTED*. It would be good if the command could also return the last stable offset (LSO) for transactional use cases, i.e. send the request with *IsolationLevel.READ_COMMITTED*.

> Allow to get last stable offset (LSO) in kafka-get-offsets.sh
>
>                 Key: KAFKA-16282
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16282
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Luke Chen
>            Priority: Major
>              Labels: need-kip, newbie, newbie++
[jira] [Updated] (KAFKA-16282) Allow to get last stable offset (LSO) in kafka-get-offsets.sh
[ https://issues.apache.org/jira/browse/KAFKA-16282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Chen updated KAFKA-16282:
------------------------------
    Labels: need-kip newbie newbie++  (was: )

> Allow to get last stable offset (LSO) in kafka-get-offsets.sh
>
>                 Key: KAFKA-16282
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16282
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Luke Chen
>            Priority: Major
>              Labels: need-kip, newbie, newbie++
[jira] [Created] (KAFKA-16282) Allow to get last stable offset (LSO) in kafka-get-offsets.sh
Luke Chen created KAFKA-16282:
---------------------------------
             Summary: Allow to get last stable offset (LSO) in kafka-get-offsets.sh
                 Key: KAFKA-16282
                 URL: https://issues.apache.org/jira/browse/KAFKA-16282
             Project: Kafka
          Issue Type: Improvement
            Reporter: Luke Chen
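The distinction this issue relies on can be illustrated with a toy model. This is plain-stdlib Java, not Kafka's code: the `Record` type and both method names are invented for illustration. Under READ_UNCOMMITTED the "latest" offset is the high watermark (HWM); under READ_COMMITTED it is the last stable offset (LSO), which stops at the first record of the earliest still-open transaction. (For the actual fix, the Java AdminClient's `Admin.listOffsets` already accepts a `ListOffsetsOptions` carrying an `IsolationLevel`, which is presumably what the tool would pass through.)

```java
import java.util.List;

// Toy model (not Kafka code): why READ_COMMITTED's "latest" (the LSO)
// can be smaller than READ_UNCOMMITTED's "latest" (the HWM) while a
// transaction is still open.
public class LsoVsHwm {
    record Record(long offset, boolean inOpenTransaction) {}

    // HWM: one past the last fully replicated record.
    static long highWatermark(List<Record> log) {
        return log.isEmpty() ? 0 : log.get(log.size() - 1).offset() + 1;
    }

    // LSO: offset of the first record still belonging to an open
    // transaction, or the HWM if no transaction is open.
    static long lastStableOffset(List<Record> log) {
        for (Record r : log) {
            if (r.inOpenTransaction()) return r.offset();
        }
        return highWatermark(log);
    }

    public static void main(String[] args) {
        List<Record> log = List.of(
            new Record(0, false),
            new Record(1, false),
            new Record(2, true),   // an open transaction starts here
            new Record(3, true));
        System.out.println("HWM=" + highWatermark(log));    // HWM=4
        System.out.println("LSO=" + lastStableOffset(log)); // LSO=2
    }
}
```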
[jira] [Resolved] (KAFKA-15140) Improve TopicCommandIntegrationTest to be less flaky
[ https://issues.apache.org/jira/browse/KAFKA-15140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Chen resolved KAFKA-15140.
-------------------------------
    Fix Version/s:     (was: 3.5.1)
       Resolution: Fixed

> Improve TopicCommandIntegrationTest to be less flaky
>
>                 Key: KAFKA-15140
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15140
>             Project: Kafka
>          Issue Type: Test
>          Components: unit tests
>            Reporter: Divij Vaidya
>            Assignee: Lan Ding
>            Priority: Minor
>              Labels: flaky-test, newbie
>             Fix For: 3.8.0
>
> *This is a good Jira for folks who are new to contributing to Kafka.*
> Tests in TopicCommandIntegrationTest get flaky from time to time. The objective of the task is to make them more robust by doing the following:
> 1. Replace the calls to adminClient.createTopics() and other places where we are creating a topic (without waiting) with TestUtils.createTopicWithAdmin(). The latter method already contains the functionality to create a topic and wait for metadata to sync up.
> 2. Replace the number 6 at places such as "adminClient.createTopics(Collections.singletonList(new NewTopic("foo_bar", 1, 6.toShort)))" with a meaningful constant.
> 3. Add logs if an assertion fails; for example, lines such as "assertTrue(rows(0).startsWith(s"\tTopic: $testTopicName"), output)" should have an argument which prints the actual output, so that we can see in the test logs what the output was when the assertion failed.
> 4. Replace occurrences of "\n" with System.lineSeparator(), which is platform independent.
> 5. We should wait for reassignment to complete whenever we are re-assigning partitions using alterconfig, before we call describe to validate it. We could use TestUtils.waitForAllReassignmentsToComplete().
>
> *Motivation of this task*
> Try to fix flaky test behaviour such as observed in [https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-13924/5/testReport/junit/kafka.admin/TopicCommandIntegrationTest/Build___JDK_11_and_Scala_2_13___testDescribeUnderMinIsrPartitionsMixed_String__quorum_zk/]
>
> {noformat}
> org.opentest4j.AssertionFailedError: expected: but was:
>     at app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
>     at app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
>     at app//org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
>     at app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
>     at app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:31)
>     at app//org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:180)
>     at app//kafka.admin.TopicCommandIntegrationTest.testDescribeUnderMinIsrPartitionsMixed(TopicCommandIntegrationTest.scala:794){noformat}
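Suggestions 3 and 4 above can be sketched in a few lines of plain Java (no JUnit; the variable names mirror the Scala snippet in the issue but are otherwise hypothetical): pass the actual output as the assertion message, and split on the platform line separator instead of a hard-coded "\n".

```java
// Sketch of suggestions 3 and 4: debuggable assertion messages and
// platform-independent line splitting.
public class RobustAssertSketch {
    static void assertTrue(boolean condition, String message) {
        if (!condition) throw new AssertionError(message);
    }

    public static void main(String[] args) {
        String testTopicName = "foo_bar"; // hypothetical topic name
        String output = "\tTopic: " + testTopicName + System.lineSeparator()
                + "\tPartition: 0";

        // Suggestion 4: split on System.lineSeparator(), not "\n".
        String[] rows = output.split(System.lineSeparator());

        // Suggestion 3: include the full output so a failure is debuggable.
        assertTrue(rows[0].startsWith("\tTopic: " + testTopicName),
                   "unexpected describe output: " + output);
        System.out.println("rows=" + rows.length); // prints rows=2
    }
}
```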
[jira] [Resolved] (KAFKA-16247) replica keep out-of-sync after migrating broker to KRaft
[ https://issues.apache.org/jira/browse/KAFKA-16247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Chen resolved KAFKA-16247.
-------------------------------
    Resolution: Fixed

Fixed in 3.7.0 RC4

> replica keep out-of-sync after migrating broker to KRaft
>
>                 Key: KAFKA-16247
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16247
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 3.7.0
>            Reporter: Luke Chen
>            Priority: Major
>         Attachments: KAFKA-16247.zip
>
> We are deploying 3 controllers and 3 brokers, following the steps in the [doc|https://kafka.apache.org/documentation/#kraft_zk_migration]. When we move from the "Enabling the migration on the brokers" state to the "Migrating brokers to KRaft" state, the first rolled broker becomes out-of-sync and never becomes in-sync again.
> From the log, we can see some "reject alterPartition" errors, but they only happened twice. Theoretically, the leader should add the follower into the ISR as long as the follower is fetching, since we don't have clients writing data. But I can't figure out why it didn't fetch.
> Logs: https://gist.github.com/showuon/64c4dcecb238a317bdbdec8db17fd494
> ===
> update Feb. 14
> After further investigating the logs, I think the reason the replica is not added into the ISR is that the alterPartition request got a non-retriable error from the controller:
> {code:java}
> Failed to alter partition to PendingExpandIsr(newInSyncReplicaId=0, sentLeaderAndIsr=LeaderAndIsr(leader=1, leaderEpoch=4, isrWithBrokerEpoch=List(BrokerState(brokerId=1, brokerEpoch=-1), BrokerState(brokerId=2, brokerEpoch=-1), BrokerState(brokerId=0, brokerEpoch=-1)), leaderRecoveryState=RECOVERED, partitionEpoch=7), leaderRecoveryState=RECOVERED, lastCommittedState=CommittedPartitionState(isr=Set(1, 2), leaderRecoveryState=RECOVERED)) because the partition epoch is invalid. Partition state may be out of sync, awaiting new the latest metadata. (kafka.cluster.Partition) [zk-broker-1-to-controller-alter-partition-channel-manager]
> {code}
> Since it's a non-retriable error, we keep the state as pending and wait for a later leaderAndISR update, as described [here|https://github.com/apache/kafka/blob/d24abe0edebad37e554adea47408c3063037f744/core/src/main/scala/kafka/cluster/Partition.scala#L1876C1-L1876C41].
> Log analysis: https://gist.github.com/showuon/5514cbb995fc2ae6acd5858f69c137bb
> So the question becomes:
> 1. Why does the controller increase the partition epoch?
> 2. When the leader receives the leaderAndISR request from the controller, it ignores the request because the leader epoch is identical, even though the partition epoch is updated. Is that behavior expected? Will it impact later alterPartition requests?
[jira] [Updated] (KAFKA-16247) replica keep out-of-sync after migrating broker to KRaft
[ https://issues.apache.org/jira/browse/KAFKA-16247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Chen updated KAFKA-16247:
------------------------------
    Description: updated

> replica keep out-of-sync after migrating broker to KRaft
>
>                 Key: KAFKA-16247
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16247
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 3.7.0
>            Reporter: Luke Chen
>            Priority: Major
>         Attachments: KAFKA-16247.zip
[jira] [Created] (KAFKA-16247) 1 replica keep out-of-sync after migrating broker to KRaft
Luke Chen created KAFKA-16247:
---------------------------------
             Summary: 1 replica keep out-of-sync after migrating broker to KRaft
                 Key: KAFKA-16247
                 URL: https://issues.apache.org/jira/browse/KAFKA-16247
             Project: Kafka
          Issue Type: Bug
            Reporter: Luke Chen
[jira] [Updated] (KAFKA-16247) replica keep out-of-sync after migrating broker to KRaft
[ https://issues.apache.org/jira/browse/KAFKA-16247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Chen updated KAFKA-16247:
------------------------------
    Summary: replica keep out-of-sync after migrating broker to KRaft  (was: 1 replica keep out-of-sync after migrating broker to KRaft)
[jira] [Created] (KAFKA-16232) kafka hangs forever in the starting process if the authorizer future is not returned
Luke Chen created KAFKA-16232:
---------------------------------
             Summary: kafka hangs forever in the starting process if the authorizer future is not returned
                 Key: KAFKA-16232
                 URL: https://issues.apache.org/jira/browse/KAFKA-16232
             Project: Kafka
          Issue Type: Improvement
    Affects Versions: 3.6.1
            Reporter: Luke Chen

For security reasons, during broker startup we wait until all ACL entries are loaded before serving requests. But recently, we accidentally configured StandardAuthorizer on a ZK broker, and the broker never entered the RUNNING state because it was waiting for the StandardAuthorizer future to complete. Of course this is a human error (a wrong configuration), but it would be better if we handled this case more gracefully.

Suggestions:
1. Set a timeout for the authorizer future wait (how long is long enough?)
2. Add logs before and after the future wait, so an admin can tell we're waiting for the authorizer future.

We can start with (2) and think about (1) later.
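Suggestions (1) and (2) above can be sketched with a plain CompletableFuture standing in for the authorizer's start-up future. The class and method names are hypothetical, not Kafka's actual broker code: log before and after the wait, and optionally bound it with a timeout instead of blocking forever.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch: bounded, logged wait for an ACL-loading future (hypothetical
// names; a stand-in for the broker's authorizer start-up future).
public class AuthorizerWaitSketch {
    static void awaitAuthorizer(CompletableFuture<Void> aclLoadFuture,
                                long timeoutMs) throws Exception {
        // Suggestion 2: log before the wait so the admin knows why
        // startup is paused.
        System.out.println("Waiting up to " + timeoutMs
                + " ms for authorizer ACL loading to complete...");
        try {
            // Suggestion 1: bound the wait instead of blocking forever.
            aclLoadFuture.get(timeoutMs, TimeUnit.MILLISECONDS);
            System.out.println("Authorizer ready, continuing startup.");
        } catch (TimeoutException e) {
            // A clear failure beats hanging in the starting state forever.
            throw new IllegalStateException(
                "Authorizer future not completed within " + timeoutMs + " ms", e);
        }
    }

    public static void main(String[] args) throws Exception {
        awaitAuthorizer(CompletableFuture.completedFuture(null), 1000);
    }
}
```

The real trade-off is the one the issue raises: any fixed timeout risks aborting a legitimately slow ACL load, which is why logging alone is the safer first step.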
[jira] [Resolved] (KAFKA-16157) Topic recreation with offline disk doesn't update leadership/shrink ISR correctly
[ https://issues.apache.org/jira/browse/KAFKA-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Chen resolved KAFKA-16157.
-------------------------------
    Resolution: Fixed

> Topic recreation with offline disk doesn't update leadership/shrink ISR correctly
>
>                 Key: KAFKA-16157
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16157
>             Project: Kafka
>          Issue Type: Bug
>          Components: jbod, kraft
>    Affects Versions: 3.7.0
>            Reporter: Gaurav Narula
>            Assignee: Gaurav Narula
>            Priority: Blocker
>             Fix For: 3.7.0
>
>         Attachments: broker.log, broker.log.1, broker.log.2, broker.log.3, broker.log.4, broker.log.5, broker.log.6, broker.log.7, broker.log.8, broker.log.9, broker.log.10
>
> In a cluster with 4 brokers, `broker-1..broker-4`, with 2 disks `d1` and `d2` in each broker, we perform the following operations:
> # Create a topic `foo.test` with 10 partitions and RF 4. Let's assume the topic was created with id `rAujIqcjRbu_-E4UxgQT8Q`.
> # Start a producer in the background to produce to `foo.test`.
> # Break disk `d1` in `broker-1`. We simulate this by marking the log dir read-only.
> # Delete topic `foo.test`.
> # Recreate topic `foo.test`. Let's assume the topic was created with id `bgdrsv-1QjCLFEqLOzVCHg`.
> # Wait for 5 minutes.
> # Describe the recreated topic `foo.test`.
> We observe that `broker-1` is the leader and in-sync for a few partitions:
> {code:java}
> Topic: foo.test  TopicId: bgdrsv-1QjCLFEqLOzVCHg  PartitionCount: 10  ReplicationFactor: 4  Configs: min.insync.replicas=1,unclean.leader.election.enable=false
>   Topic: foo.test  Partition: 0  Leader: 101  Replicas: 101,102,103,104  Isr: 101,102,103,104
>   Topic: foo.test  Partition: 1  Leader: 102  Replicas: 102,103,104,101  Isr: 102,103,104
>   Topic: foo.test  Partition: 2  Leader: 103  Replicas: 103,104,101,102  Isr: 103,104,102
>   Topic: foo.test  Partition: 3  Leader: 104  Replicas: 104,101,102,103  Isr: 104,102,103
>   Topic: foo.test  Partition: 4  Leader: 104  Replicas: 104,102,101,103  Isr: 104,102,103
>   Topic: foo.test  Partition: 5  Leader: 102  Replicas: 102,101,103,104  Isr: 102,103,104
>   Topic: foo.test  Partition: 6  Leader: 101  Replicas: 101,103,104,102  Isr: 101,103,104,102
>   Topic: foo.test  Partition: 7  Leader: 103  Replicas: 103,104,102,101  Isr: 103,104,102
>   Topic: foo.test  Partition: 8  Leader: 101  Replicas: 101,102,104,103  Isr: 101,102,104,103
>   Topic: foo.test  Partition: 9  Leader: 102  Replicas: 102,104,103,101  Isr: 102,104,103
> {code}
> In this example, it is the leader of partitions 0, 6 and 8.
> Consider `foo.test-8`. It is present on the following brokers/disks:
> {code:java}
> $ fd foo.test-8
> broker-1/d1/foo.test-8/
> broker-2/d2/foo.test-8/
> broker-3/d2/foo.test-8/
> broker-4/d1/foo.test-8/{code}
> `broker-1/d1` still refers to the topic id which is pending deletion, because the log dir is marked offline:
> {code:java}
> $ cat broker-1/d1/foo.test-8/partition.metadata
> version: 0
> topic_id: rAujIqcjRbu_-E4UxgQT8Q{code}
> However, other brokers have the correct topic id:
> {code:java}
> $ cat broker-2/d2/foo.test-8/partition.metadata
> version: 0
> topic_id: bgdrsv-1QjCLFEqLOzVCHg{code}
> Now, let's consider `foo.test-0`. We observe that the replica isn't present on `broker-1`:
> {code:java}
> $ fd foo.test-0
> broker-2/d1/foo.test-0/
> broker-3/d1/foo.test-0/
> broker-4/d2/foo.test-0/{code}
> In both cases, `broker-1` shouldn't be the leader or an in-sync replica for these partitions.
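The stale-replica situation above boils down to a topic-id mismatch in `partition.metadata`. A toy check (hypothetical helper, not broker code) that parses the file's content as shown in the issue and compares it against the recreated topic's id:

```java
// Toy check for the stale-topic-id case: a partition directory on the
// offline disk still carries the deleted topic's id, so comparing its
// partition.metadata against the current topic id flags the stale replica.
public class TopicIdCheck {
    // Parses the "topic_id: ..." line of a partition.metadata file's content.
    static String parseTopicId(String metadata) {
        for (String line : metadata.split("\\R")) {
            if (line.startsWith("topic_id: ")) {
                return line.substring("topic_id: ".length()).trim();
            }
        }
        throw new IllegalArgumentException("no topic_id line found");
    }

    static boolean isStale(String metadata, String currentTopicId) {
        return !parseTopicId(metadata).equals(currentTopicId);
    }

    public static void main(String[] args) {
        // Values taken from the issue: old id on broker-1/d1, new topic id.
        String offlineCopy = "version: 0\ntopic_id: rAujIqcjRbu_-E4UxgQT8Q";
        String currentId = "bgdrsv-1QjCLFEqLOzVCHg";
        System.out.println(isStale(offlineCopy, currentId)); // true
    }
}
```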
[jira] [Commented] (KAFKA-16218) Partition reassignment can't complete if any target replica is out-of-sync
[ https://issues.apache.org/jira/browse/KAFKA-16218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17813516#comment-17813516 ]

Luke Chen commented on KAFKA-16218:
-----------------------------------
As long as replica 2001 catches up, everything should work well, right? If so, I think this works as designed. You are welcome to submit a KIP to improve this behavior.

> Partition reassignment can't complete if any target replica is out-of-sync
>
>                 Key: KAFKA-16218
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16218
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Drawxy
>            Priority: Major
>
> Assume there are 4 brokers (1001, 2001, 3001, 4001) and a topic partition _foo-0_ (replicas [1001,2001,3001], isr [1001,3001]). Replica 2001 can't catch up and became out-of-sync due to some issue.
> If we launch a partition reassignment for _foo-0_ (target replica list [1001,2001,4001]), the reassignment can't complete even after the adding replica 4001 has caught up. At that point, the partition state would be replicas [1001,2001,4001,3001], isr [1001,3001,4001].
> The out-of-sync replica 2001 shouldn't make the partition reassignment stuck.
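The behavior discussed above follows from the completion condition: a reassignment finishes only once every target replica is in the ISR. A simplified sketch of that condition (not the controller's actual code), using the broker ids from the issue:

```java
import java.util.List;
import java.util.Set;

// Simplified sketch of the reassignment completion condition: every
// target replica must be in the ISR, so an out-of-sync replica kept in
// the target list (2001 here) blocks completion indefinitely.
public class ReassignmentCheck {
    static boolean isComplete(List<Integer> targetReplicas, Set<Integer> isr) {
        return isr.containsAll(targetReplicas);
    }

    public static void main(String[] args) {
        List<Integer> target = List.of(1001, 2001, 4001);
        Set<Integer> isr = Set.of(1001, 3001, 4001); // 2001 never catches up
        System.out.println(isComplete(target, isr)); // false: stuck
    }
}
```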
[jira] [Commented] (KAFKA-16214) No user info when SASL authentication failure
[ https://issues.apache.org/jira/browse/KAFKA-16214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17813166#comment-17813166 ]

Luke Chen commented on KAFKA-16214:
-----------------------------------
PR: https://github.com/apache/kafka/pull/15280

> No user info when SASL authentication failure
>
>                 Key: KAFKA-16214
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16214
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 3.6.0
>            Reporter: Luke Chen
>            Assignee: Luke Chen
>            Priority: Major
>
> When client authentication fails, the server logs only the client IP address. The IP address sometimes cannot identify a specific user, especially if there is a proxy between client and server. Ex:
> {code:java}
> INFO [SocketServer listenerType=ZK_BROKER, nodeId=0] Failed authentication with /127.0.0.1 (channelId=127.0.0.1:9093-127.0.0.1:53223-5) (Authentication failed: Invalid username or password) (org.apache.kafka.common.network.Selector)
> {code}
> If many failed-authentication logs appear on the server, it would help to identify quickly who is triggering them. Adding the client info to the log is a good start.
[jira] [Created] (KAFKA-16214) No user info when SASL authentication failure
Luke Chen created KAFKA-16214:
---------------------------------
             Summary: No user info when SASL authentication failure
                 Key: KAFKA-16214
                 URL: https://issues.apache.org/jira/browse/KAFKA-16214
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 3.6.0
            Reporter: Luke Chen
            Assignee: Luke Chen
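The improvement proposed in this issue can be sketched as a log-message builder. This is a hypothetical helper, not Kafka's Selector code, and `clientId` stands in for whatever client-side identity is actually available at failure time: include it next to the remote address so a user behind a proxy can still be identified.

```java
// Sketch: an auth-failure log line that carries client identity in
// addition to the remote address (hypothetical helper and field names).
public class AuthFailureLog {
    static String format(String remoteAddress, String channelId,
                         String clientId, String reason) {
        return "Failed authentication with " + remoteAddress
                + " (channelId=" + channelId + ")"
                + (clientId == null ? "" : " (clientId=" + clientId + ")")
                + " (" + reason + ")";
    }

    public static void main(String[] args) {
        System.out.println(format("/127.0.0.1",
                "127.0.0.1:9093-127.0.0.1:53223-5",
                "payments-producer-7", // hypothetical client id
                "Authentication failed: Invalid username or password"));
    }
}
```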
[jira] [Resolved] (KAFKA-16101) KRaft migration rollback documentation is incorrect
[ https://issues.apache.org/jira/browse/KAFKA-16101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen resolved KAFKA-16101. --- Resolution: Fixed > KRaft migration rollback documentation is incorrect > --- > > Key: KAFKA-16101 > URL: https://issues.apache.org/jira/browse/KAFKA-16101 > Project: Kafka > Issue Type: Bug > Components: kraft >Affects Versions: 3.6.1 >Reporter: Paolo Patierno >Assignee: Colin McCabe >Priority: Blocker > Fix For: 3.7.0 > > > Hello, > I was trying the KRaft migration rollback procedure locally and I came across > a potential bug or anyway a situation where the cluster is not > usable/available for a certain amount of time. > In order to test the procedure, I start with a one broker (broker ID = 0) and > one zookeeper node cluster. Then I start the migration with a one KRaft > controller node (broker ID = 1). The migration runs fine and it reaches the > point of "dual write" state. > From this point, I try to run the rollback procedure as described in the > documentation. > As first step, this involves ... 
> * stopping the broker > * removing the __cluster_metadata folder > * removing ZooKeeper migration flag and controller(s) related configuration > from the broker > * restarting the broker > With the above steps done, the broker starts in ZooKeeper mode (no migration, > no KRaft controllers knowledge) and it keeps logging the following messages > in DEBUG: > {code:java} > [2024-01-08 11:51:20,608] DEBUG > [zk-broker-0-to-controller-forwarding-channel-manager]: Controller isn't > cached, looking for local metadata changes > (kafka.server.BrokerToControllerRequestThread) > [2024-01-08 11:51:20,608] DEBUG > [zk-broker-0-to-controller-forwarding-channel-manager]: No controller > provided, retrying after backoff > (kafka.server.BrokerToControllerRequestThread) > [2024-01-08 11:51:20,629] DEBUG > [zk-broker-0-to-controller-alter-partition-channel-manager]: Controller isn't > cached, looking for local metadata changes > (kafka.server.BrokerToControllerRequestThread) > [2024-01-08 11:51:20,629] DEBUG > [zk-broker-0-to-controller-alter-partition-channel-manager]: No controller > provided, retrying after backoff > (kafka.server.BrokerToControllerRequestThread) {code} > What's happening should be clear. > The /controller znode in ZooKeeper still reports the KRaft controller (broker > ID = 1) as controller. The broker gets it from the znode but doesn't know how > to reach it. > The issue is that until the procedure isn't fully completed with the next > steps (shutting down KRaft controller, deleting /controller znode), the > cluster is unusable. Any admin or client operation against the broker doesn't > work, just hangs, the broker doesn't reply. > Imagining this scenario to a more complex one with 10-20-50 brokers and > partitions' replicas spread across them, when the brokers are rolled one by > one (in ZK mode) reporting the above error, the topics will become not > available one after the other, until all brokers are in such a state and > nothing can work. 
This is because from a KRaft controller perspective (still > running), the brokers are not available anymore and the partitions' replicas > are out of sync. > Of course, as soon as you complete the rollback procedure, after deleting the > /controller znode, the brokers are able to elect a new controller among them > and everything recovers. > My first question ... isn't the cluster supposed to remain available during > the rollback, even while the procedure is not completed yet? Or is cluster > unavailability an accepted assumption during the rollback, until it's fully > completed? > This "unavailability" time window could be reduced by deleting the > /controller znode before shutting down the KRaft controllers, to allow the > brokers to elect a new controller among them, but in this case, could there > be a race condition where KRaft controllers still running could steal > leadership again? > Or is there perhaps something missing in the documentation that is driving > this problem? -- This message was sent by Atlassian Jira (v8.20.10#820010)
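The failure mode described above can be modeled with a small sketch. This is not Kafka code; the class, method, and endpoint names are all hypothetical. It only illustrates why a broker rolled back to ZK mode keeps retrying: the /controller znode still names the KRaft controller, whose endpoint the ZK-mode broker does not know, so controller resolution comes back empty forever.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch (not Kafka code): a broker restarted in pure ZK mode
// only knows the endpoints of ZK brokers, but the /controller znode still
// names the KRaft controller (id 1). Resolution keeps returning empty, which
// mirrors the "No controller provided, retrying after backoff" log loop.
public class ControllerResolutionSketch {
    public static Optional<String> resolveController(int controllerIdFromZnode,
                                                     Map<Integer, String> knownEndpoints) {
        // An id with no known endpoint yields nothing to connect to.
        return Optional.ofNullable(knownEndpoints.get(controllerIdFromZnode));
    }

    public static void main(String[] args) {
        Map<Integer, String> zkBrokerEndpoints = Map.of(0, "broker0:9092");
        // /controller still points at the KRaft controller (id 1) -> unresolvable.
        System.out.println(resolveController(1, zkBrokerEndpoints)); // prints "Optional.empty"
        // Once the /controller znode is deleted and a ZK broker (id 0) wins the
        // election, resolution succeeds and the cluster recovers.
        System.out.println(resolveController(0, zkBrokerEndpoints));
    }
}
```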
[jira] [Commented] (KAFKA-16209) fetchSnapshot might return null if topic is created before v2.8
[ https://issues.apache.org/jira/browse/KAFKA-16209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812157#comment-17812157 ] Luke Chen commented on KAFKA-16209: --- Thanks [~goyarpit]! > fetchSnapshot might return null if topic is created before v2.8 > --- > > Key: KAFKA-16209 > URL: https://issues.apache.org/jira/browse/KAFKA-16209 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.6.1 >Reporter: Luke Chen >Assignee: Arpit Goyal >Priority: Major > Labels: newbie, newbie++ > > Remote log manager will fetch snapshot via ProducerStateManager > [here|https://github.com/apache/kafka/blob/trunk/storage/src/main/java/org/apache/kafka/storage/internals/log/ProducerStateManager.java#L608], > but the snapshot map might get nothing if the topic has no snapshot created, > ex: topics before v2.8. Need to fix it to avoid NPE. > old PR: https://github.com/apache/kafka/pull/14615/ -- This message was sent by Atlassian Jira (v8.20.10#820010)
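The NPE risk described above can be sketched as follows. This is a simplified, hypothetical model, not the real ProducerStateManager API: the point is only that a lookup into an empty snapshot map (e.g. a topic created before v2.8 that never had a producer-state snapshot written) must surface "no snapshot" explicitly instead of returning null for a caller to dereference.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch: snapshots keyed by offset. Pre-v2.8 topics have an
// empty map, so a blind dereference of the lookup result is an NPE waiting
// to happen; returning Optional forces callers to handle the empty case.
public class SnapshotLookupSketch {
    private final ConcurrentSkipListMap<Long, String> snapshots = new ConcurrentSkipListMap<>();

    public void addSnapshot(long offset, String file) {
        snapshots.put(offset, file);
    }

    // Unsafe variant: returns null when no snapshot exists at or below offset.
    public String fetchSnapshotUnsafe(long offset) {
        Map.Entry<Long, String> entry = snapshots.floorEntry(offset);
        return entry == null ? null : entry.getValue();
    }

    // Safe variant: callers must handle the "no snapshot" case explicitly.
    public Optional<String> fetchSnapshot(long offset) {
        return Optional.ofNullable(fetchSnapshotUnsafe(offset));
    }

    public static void main(String[] args) {
        SnapshotLookupSketch mgr = new SnapshotLookupSketch(); // pre-2.8 topic: no snapshots
        System.out.println(mgr.fetchSnapshot(100L));           // empty, no NPE
        mgr.addSnapshot(50L, "00000000000000000050.snapshot");
        System.out.println(mgr.fetchSnapshot(100L));           // snapshot at offset 50
    }
}
```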
[jira] [Created] (KAFKA-16209) fetchSnapshot might return null if topic is created before v2.8
Luke Chen created KAFKA-16209: - Summary: fetchSnapshot might return null if topic is created before v2.8 Key: KAFKA-16209 URL: https://issues.apache.org/jira/browse/KAFKA-16209 Project: Kafka Issue Type: Bug Affects Versions: 3.6.1 Reporter: Luke Chen The remote log manager will fetch the snapshot via ProducerStateManager [here|https://github.com/apache/kafka/blob/trunk/storage/src/main/java/org/apache/kafka/storage/internals/log/ProducerStateManager.java#L608], but the snapshot map might return nothing if the topic has no snapshot created, e.g. topics created before v2.8. This needs to be fixed to avoid an NPE. old PR: https://github.com/apache/kafka/pull/14615/ -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-16209) fetchSnapshot might return null if topic is created before v2.8
[ https://issues.apache.org/jira/browse/KAFKA-16209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen updated KAFKA-16209: -- Labels: newbie newbie++ (was: ) > fetchSnapshot might return null if topic is created before v2.8 > --- > > Key: KAFKA-16209 > URL: https://issues.apache.org/jira/browse/KAFKA-16209 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.6.1 >Reporter: Luke Chen >Priority: Major > Labels: newbie, newbie++ > > Remote log manager will fetch snapshot via ProducerStateManager > [here|https://github.com/apache/kafka/blob/trunk/storage/src/main/java/org/apache/kafka/storage/internals/log/ProducerStateManager.java#L608], > but the snapshot map might get nothing if the topic has no snapshot created, > ex: topics before v2.8. Need to fix it to avoid NPE. > old PR: https://github.com/apache/kafka/pull/14615/ -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-16204) Stray file core/00000000000000000001.snapshot created when running core tests
[ https://issues.apache.org/jira/browse/KAFKA-16204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen updated KAFKA-16204: -- Labels: newbie newbie++ (was: ) > Stray file core/00000000000000000001.snapshot created when running core tests > - > > Key: KAFKA-16204 > URL: https://issues.apache.org/jira/browse/KAFKA-16204 > Project: Kafka > Issue Type: Improvement > Components: core, unit tests >Reporter: Mickael Maison >Priority: Major > Labels: newbie, newbie++ > > When running the core tests I often get a file called > core/00000000000000000001.snapshot created in my kafka folder. It looks like > one of the tests does not clean up its resources properly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-13840) KafkaConsumer is unable to recover connection to group coordinator after commitOffsetsAsync exception
[ https://issues.apache.org/jira/browse/KAFKA-13840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17811499#comment-17811499 ] Luke Chen commented on KAFKA-13840: --- I think it's this one: https://github.com/apache/kafka/pull/12259 . > KafkaConsumer is unable to recover connection to group coordinator after > commitOffsetsAsync exception > - > > Key: KAFKA-13840 > URL: https://issues.apache.org/jira/browse/KAFKA-13840 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Affects Versions: 2.6.1, 3.1.0, 2.7.2, 2.8.1, 3.0.0 >Reporter: Kyle R Stehbens >Assignee: Luke Chen >Priority: Critical > Fix For: 3.2.1 > > > Hi, I've discovered an issue with the Java Kafka client (consumer) whereby a > timeout or any other retry-able exception triggered during an async offset > commit renders the client unable to recover its group coordinator and > leaves the client in a broken state. > > I first encountered this using v2.8.1 of the Java client, and after going > through the code base for all versions of the client, have found it affects > all versions of the client from 2.6.1 onward. > I also confirmed that by rolling back to 2.5.1, the issue is not present. 
> > The issue stems from a change in behavior: the FindCoordinatorResponseHandler in > 2.5.1 used to call clearFindCoordinatorFuture() on both success and failure > here: > [https://github.com/apache/kafka/blob/0efa8fb0f4c73d92b6e55a112fa45417a67a7dc2/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java#L783] > > In all later versions of the client this call is not made: > [https://github.com/apache/kafka/blob/839b886f9b732b151e1faeace7303c80641c08c4/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java#L838] > > As a result, when the KafkaConsumer makes a call to > coordinator.commitOffsetsAsync(...), if an error occurs such that the > coordinator is unavailable here: > [https://github.com/apache/kafka/blob/c5077c679c372589215a1b58ca84360c683aa6e8/clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerCoordinator.java#L1007] > > then the client will try to call: > [https://github.com/apache/kafka/blob/c5077c679c372589215a1b58ca84360c683aa6e8/clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerCoordinator.java#L1017] > However, this can never succeed, as it perpetually returns a > reference to a failed future (findCoordinatorFuture) that is never cleared out. > > This manifests in all subsequent calls to commitOffsetsAsync() throwing a > "coordinator unavailable" exception after any > retry-able exception causes the coordinator to close. > Note: we discovered this when we upgraded the Kafka client in our Flink > consumers from 2.4.1 to 2.8.1 and subsequently needed to downgrade the > client. We noticed this occurring in our non-Flink Java consumers too, running > 3.x client versions. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
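The stale-future behavior described above can be reproduced in miniature. This is a hypothetical sketch, not the real AbstractCoordinator: it only shows why clearing the cached find-coordinator future on completion (the 2.5.1-era behavior) lets a later lookup succeed, while keeping a failed future cached makes every later call fail.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: a cached find-coordinator future. If a failed future
// stays cached, every later lookup returns the same failure forever, matching
// the "coordinator unavailable forever" symptom described in the ticket.
public class CoordinatorLookupSketch {
    private final AtomicReference<CompletableFuture<String>> findFuture = new AtomicReference<>();
    private boolean coordinatorReachable = false;

    public void setReachable(boolean reachable) { this.coordinatorReachable = reachable; }

    public CompletableFuture<String> lookupCoordinator(boolean clearOnCompletion) {
        CompletableFuture<String> cached = findFuture.get();
        if (cached != null) return cached;          // reuse the in-flight (or stale!) future
        CompletableFuture<String> f = new CompletableFuture<>();
        findFuture.set(f);
        if (coordinatorReachable) f.complete("coordinator-host:9092");
        else f.completeExceptionally(new RuntimeException("coordinator unavailable"));
        if (clearOnCompletion)
            f.whenComplete((r, e) -> findFuture.set(null)); // the 2.5.1-era behavior
        return f;
    }

    public static boolean failed(CompletableFuture<?> f) { return f.isCompletedExceptionally(); }

    public static void main(String[] args) {
        CoordinatorLookupSketch broken = new CoordinatorLookupSketch();
        broken.lookupCoordinator(false);            // fails, failed future stays cached
        broken.setReachable(true);                  // coordinator comes back...
        System.out.println(failed(broken.lookupCoordinator(false))); // true: still the stale failure

        CoordinatorLookupSketch fixed = new CoordinatorLookupSketch();
        fixed.lookupCoordinator(true);              // fails, but is cleared on completion
        fixed.setReachable(true);
        System.out.println(failed(fixed.lookupCoordinator(true)));   // false: retried successfully
    }
}
```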
[jira] [Resolved] (KAFKA-16085) remote copy lag bytes/segments metrics don't update all topic value
[ https://issues.apache.org/jira/browse/KAFKA-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen resolved KAFKA-16085. --- Fix Version/s: 3.7.0 Assignee: Luke Chen Resolution: Fixed > remote copy lag bytes/segments metrics don't update all topic value > --- > > Key: KAFKA-16085 > URL: https://issues.apache.org/jira/browse/KAFKA-16085 > Project: Kafka > Issue Type: Bug >Reporter: Luke Chen >Assignee: Luke Chen >Priority: Major > Fix For: 3.7.0 > > > The metrics added in > [KIP-963|https://cwiki.apache.org/confluence/display/KAFKA/KIP-963%3A+Additional+metrics+in+Tiered+Storage#KIP963:AdditionalmetricsinTieredStorage-Copy] > are BrokerTopicMetrics, which means they should provide both a per-topic metric > value and an all-topics metric value. But the current implementation doesn't > update the all-topics metric value. > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
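The reported bug boils down to the BrokerTopicMetrics contract: every update must move both the per-topic value and the all-topics aggregate. Below is a minimal sketch of that contract; the class and method names are hypothetical, and the real remote-copy lag metrics are not simple counters, so this only illustrates the missing aggregate update.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch of the per-topic + all-topics metric contract. The bug
// was updating only the per-topic side for the remote copy lag metrics.
public class TopicMetricsSketch {
    private final Map<String, LongAdder> perTopicLagBytes = new ConcurrentHashMap<>();
    private final LongAdder allTopicsLagBytes = new LongAdder();

    public void recordCopyLagBytes(String topic, long bytes) {
        perTopicLagBytes.computeIfAbsent(topic, t -> new LongAdder()).add(bytes);
        allTopicsLagBytes.add(bytes); // the step the buggy implementation skipped
    }

    public long topicLag(String topic) {
        return perTopicLagBytes.getOrDefault(topic, new LongAdder()).sum();
    }

    public long allTopicsLag() {
        return allTopicsLagBytes.sum();
    }

    public static void main(String[] args) {
        TopicMetricsSketch metrics = new TopicMetricsSketch();
        metrics.recordCopyLagBytes("orders", 1024);
        metrics.recordCopyLagBytes("payments", 2048);
        System.out.println(metrics.topicLag("orders"));  // 1024
        System.out.println(metrics.allTopicsLag());      // 3072: aggregate moves too
    }
}
```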
[jira] [Commented] (KAFKA-16162) New created topics are unavailable after upgrading to 3.7
[ https://issues.apache.org/jira/browse/KAFKA-16162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17811230#comment-17811230 ] Luke Chen commented on KAFKA-16162: --- You are right [~gnarula]. I agree the best fix should be re-sending the BrokerRegistration request with the same incarnation id when handling the metadata version update. I can see the current fix works well right now, but we are unsure if the logDirs in the controller view will be used for any other purpose in the future, which will cause another potential issue then. [~pprovenzano], WDYT? > New created topics are unavailable after upgrading to 3.7 > - > > Key: KAFKA-16162 > URL: https://issues.apache.org/jira/browse/KAFKA-16162 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.7.0 >Reporter: Luke Chen >Priority: Blocker > > In 3.7, we introduced the KIP-858 JBOD feature, and the brokerRegistration > request will include the `LogDirs` fields with UUID for each log dir in each > broker. This info will be stored in the controller and used to identify if > the log dir is known and online while handling AssignReplicasToDirsRequest > [here|https://github.com/apache/kafka/blob/trunk/metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java#L2093]. > > While upgrading from old version, the kafka cluster will run in 3.7 binary > with old metadata version, and then upgrade to newer version using > kafka-features.sh. That means, while brokers startup and send the > brokerRegistration request, it'll be using older metadata version without > `LogDirs` fields included. And it makes the controller has no log dir info > for all brokers. Later, after upgraded, if new topic is created, the flow > will go like this: > 1. Controller assign replicas and adds in metadata log > 2. brokers fetch the metadata and apply it > 3. ReplicaManager#maybeUpdateTopicAssignment will update topic assignment > 4. 
After sending ASSIGN_REPLICAS_TO_DIRS to controller with replica > assignment, controller will think the log dir in current replica is offline, > so triggering offline handler, and reassign leader to another replica, and > offline, until no more replicas to assign, so assigning leader to -1 (i.e. no > leader) > So, the results will be that new created topics are unavailable (with no > leader) because the controller thinks all log dir are offline. > {code:java} > lukchen@lukchen-mac kafka % bin/kafka-topics.sh --describe --topic > quickstart-events3 --bootstrap-server localhost:9092 > > Topic: quickstart-events3 TopicId: s8s6tEQyRvmjKI6ctNTgPg PartitionCount: > 3 ReplicationFactor: 3Configs: segment.bytes=1073741824 > Topic: quickstart-events3 Partition: 0Leader: none > Replicas: 7,2,6 Isr: 6 > Topic: quickstart-events3 Partition: 1Leader: none > Replicas: 2,6,7 Isr: 6 > Topic: quickstart-events3 Partition: 2Leader: none > Replicas: 6,7,2 Isr: 6 > {code} > The log snippet in the controller : > {code:java} > # handling 1st assignReplicaToDirs request > [2024-01-18 19:34:47,370] DEBUG [QuorumController id=1] Broker 6 assigned > partition quickstart-events3:0 to OFFLINE dir 7K5JBERyyqFFxIXSXYluJA > (org.apache.kafka.controller.ReplicationControlManager) > [2024-01-18 19:34:47,370] DEBUG [QuorumController id=1] Broker 6 assigned > partition quickstart-events3:2 to OFFLINE dir 7K5JBERyyqFFxIXSXYluJA > (org.apache.kafka.controller.ReplicationControlManager) > [2024-01-18 19:34:47,371] DEBUG [QuorumController id=1] Broker 6 assigned > partition quickstart-events3:1 to OFFLINE dir 7K5JBERyyqFFxIXSXYluJA > (org.apache.kafka.controller.ReplicationControlManager) > [2024-01-18 19:34:47,372] DEBUG [QuorumController id=1] > offline-dir-assignment: changing partition(s): quickstart-events3-0, > quickstart-events3-2, quickstart-events3-1 > (org.apache.kafka.controller.ReplicationControlManager) > [2024-01-18 19:34:47,372] DEBUG [QuorumController id=1] partition change for > 
quickstart-events3-0 with topic ID 6ZIeidfiSTWRiOAmGEwn_g: directories: > [AA, AA, AA] -> > [7K5JBERyyqFFxIXSXYluJA, AA, AA], > partitionEpoch: 0 -> 1 (org.apache.kafka.controller.ReplicationControlManager) > [2024-01-18 19:34:47,372] DEBUG [QuorumController id=1] Replayed partition > change PartitionChangeRecord(partitionId=0, topicId=6ZIeidfiSTWRiOAmGEwn_g, > isr=null, leader=-2, replicas=null, removingReplicas=null, > addingReplicas=null, leaderRecoveryState=-1, > directories=[7K5JBERyyqFFxIXSXYluJA, AA, > AA], eligibleLeaderReplicas=null, lastKnownELR=null) for > topic quickstart-
[jira] [Commented] (KAFKA-15776) Update delay timeout for DelayedRemoteFetch request
[ https://issues.apache.org/jira/browse/KAFKA-15776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17811229#comment-17811229 ] Luke Chen commented on KAFKA-15776: --- Interesting thoughts. But I think allowing multiple threads to work on a partition's remote fetch would make things more complicated and error-prone. I think having a configurable remote fetch timeout should already mitigate the issue. WDYT? > Update delay timeout for DelayedRemoteFetch request > --- > > Key: KAFKA-15776 > URL: https://issues.apache.org/jira/browse/KAFKA-15776 > Project: Kafka > Issue Type: Task >Reporter: Kamal Chandraprakash >Assignee: Kamal Chandraprakash >Priority: Major > > We are reusing the {{fetch.max.wait.ms}} config as a delay timeout for > DelayedRemoteFetchPurgatory. The purpose of {{fetch.max.wait.ms}} is to wait for the > given amount of time when there is no data available to serve the FETCH > request: > {code:java} > The maximum amount of time the server will block before answering the fetch > request if there isn't sufficient data to immediately satisfy the requirement > given by fetch.min.bytes. > {code} > [https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/DelayedRemoteFetch.scala#L41] > Using the same timeout in the DelayedRemoteFetchPurgatory can confuse the > user on how to configure an optimal value for each purpose. Moreover, the config > is of *LOW* importance and most users won't configure it, instead using the > default value of 500 ms. > Having a delay timeout of 500 ms in DelayedRemoteFetchPurgatory can lead to > a higher number of expired delayed remote fetch requests when the remote > storage has any degradation. > We should introduce one {{fetch.remote.max.wait.ms}} config (preferably a > server config) to define the delay timeout for DelayedRemoteFetch requests, > (or) take it from the client similar to {{request.timeout.ms}}. -- This message was sent by Atlassian Jira (v8.20.10#820010)
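The proposed config split can be sketched as follows. Note that `fetch.remote.max.wait.ms` is the config *proposed* in this ticket, not an existing Kafka config, and the resolution logic below is a hypothetical sketch of how a dedicated remote-fetch timeout could fall back to today's behavior when unset.

```java
import java.util.Properties;

// Hypothetical sketch of the proposed config split: keep fetch.max.wait.ms
// for "wait for fetch.min.bytes", and use a separate, proposed
// fetch.remote.max.wait.ms for the DelayedRemoteFetch purgatory, instead of
// reusing the 500 ms default for both.
public class RemoteFetchTimeoutSketch {
    public static long delayedRemoteFetchTimeoutMs(Properties serverProps) {
        long fetchMaxWaitMs = Long.parseLong(serverProps.getProperty("fetch.max.wait.ms", "500"));
        // Fall back to fetch.max.wait.ms only when the dedicated config is unset.
        return Long.parseLong(serverProps.getProperty("fetch.remote.max.wait.ms",
                Long.toString(fetchMaxWaitMs)));
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        System.out.println(delayedRemoteFetchTimeoutMs(p)); // 500: legacy behavior
        p.setProperty("fetch.remote.max.wait.ms", "10000");
        System.out.println(delayedRemoteFetchTimeoutMs(p)); // 10000: remote storage gets more slack
    }
}
```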
[jira] [Commented] (KAFKA-15776) Update delay timeout for DelayedRemoteFetch request
[ https://issues.apache.org/jira/browse/KAFKA-15776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17810779#comment-17810779 ] Luke Chen commented on KAFKA-15776: --- [~jeqo], For [On the timeout configuration semantics] and [On the remote fetch timeout configuration], I'm +1. Are you going to open a KIP for it? About [On not interrupting the thread], could you elaborate more about it? I'm interested. Thanks. > Update delay timeout for DelayedRemoteFetch request > --- > > Key: KAFKA-15776 > URL: https://issues.apache.org/jira/browse/KAFKA-15776 > Project: Kafka > Issue Type: Task >Reporter: Kamal Chandraprakash >Assignee: Kamal Chandraprakash >Priority: Major > > We are reusing the {{fetch.max.wait.ms}} config as a delay timeout for > DelayedRemoteFetchPurgatory. {{fetch.max.wait.ms}} purpose is to wait for the > given amount of time when there is no data available to serve the FETCH > request. > {code:java} > The maximum amount of time the server will block before answering the fetch > request if there isn't sufficient data to immediately satisfy the requirement > given by fetch.min.bytes. > {code} > [https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/DelayedRemoteFetch.scala#L41] > Using the same timeout in the DelayedRemoteFetchPurgatory can confuse the > user on how to configure optimal value for each purpose. Moreover, the config > is of *LOW* importance and most of the users won't configure it and use the > default value of 500 ms. > Having the delay timeout of 500 ms in DelayedRemoteFetchPurgatory can lead to > higher number of expired delayed remote fetch requests when the remote > storage have any degradation. > We should introduce one {{fetch.remote.max.wait.ms}} config (preferably > server config) to define the delay timeout for DelayedRemoteFetch requests > (or) take it from client similar to {{request.timeout.ms}}. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-16144) Controller leader checkQuorum timer should skip only 1 controller case
[ https://issues.apache.org/jira/browse/KAFKA-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen resolved KAFKA-16144. --- Resolution: Fixed > Controller leader checkQuorum timer should skip only 1 controller case > -- > > Key: KAFKA-16144 > URL: https://issues.apache.org/jira/browse/KAFKA-16144 > Project: Kafka > Issue Type: Bug >Reporter: Luke Chen >Assignee: Luke Chen >Priority: Blocker > Labels: newbie, newbie++ > Fix For: 3.7.0 > > > In KAFKA-15489, we fixed the potential "split brain" issue by adding the > check quorum timer. This timer is updated when a follower fetch request > arrives, and it expires when no majority of voter followers has fetched from > the leader, causing the leader to resign. > But in KAFKA-15489, we forgot to consider the case where there's only 1 > controller node. If there's only 1 controller node (and no broker node), > no fetch requests will arrive, so the timer will expire every time. > However, if there's only 1 node, we don't have to care about the "check > quorum" at all. We should skip the check for the single-controller-node case. > Currently, this issue happens only when there's 1 controller node and no > broker node (i.e. no fetch requests sent to the controller). -- This message was sent by Atlassian Jira (v8.20.10#820010)
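The fix described above can be sketched as a pure function. All names here are hypothetical (the real logic lives in the raft layer's leader state): with a single-voter quorum there are no other voters to fetch, so the majority check must be skipped entirely; otherwise the leader resigns when fewer than a majority of voters fetched within the timeout.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the checkQuorum rule: the leader counts itself plus
// every other voter that fetched within the timeout, and resigns when that
// total falls below a strict majority of the voter set.
public class CheckQuorumSketch {
    public static boolean shouldResign(int localId, Set<Integer> voters,
                                       Map<Integer, Long> lastFetchMs,
                                       long nowMs, long timeoutMs) {
        if (voters.size() == 1) return false; // the KAFKA-16144 fix: solo controller never resigns
        long recentFetchers = voters.stream()
                .filter(v -> v != localId)
                .filter(v -> lastFetchMs.containsKey(v) && nowMs - lastFetchMs.get(v) <= timeoutMs)
                .count();
        // Leader counts itself; resign if self + recent fetchers is below a majority.
        return recentFetchers + 1 < voters.size() / 2 + 1;
    }

    public static void main(String[] args) {
        // Single controller, no brokers: no fetches ever arrive, but no resignation.
        System.out.println(shouldResign(1, Set.of(1), Map.of(), 10_000L, 2_000L));       // false
        // Three voters, none fetched recently: the leader should resign.
        System.out.println(shouldResign(1, Set.of(1, 2, 3), Map.of(), 10_000L, 2_000L)); // true
    }
}
```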
[jira] [Commented] (KAFKA-16162) New created topics are unavailable after upgrading to 3.7
[ https://issues.apache.org/jira/browse/KAFKA-16162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17808578#comment-17808578 ] Luke Chen commented on KAFKA-16162: --- Thanks [~pprovenzano], I agree we also need a way to handle the case where all log dirs in a broker are offline. However, in this issue, I think there's also another potential bug: why should we update the topic assignment and send ASSIGN_REPLICAS_TO_DIRS in `ReplicaManager#maybeUpdateTopicAssignment` at all? {code:java} 1. Controller assign replicas and adds in metadata log 2. brokers fetch the metadata and apply it 3. ReplicaManager#maybeUpdateTopicAssignment will update topic assignment 4. After sending ASSIGN_REPLICAS_TO_DIRS to controller with replica assignment, controller will think the log dir in current replica is offline, so triggering offline handler, and reassign leader to another replica, and offline, until no more replicas to assign, so assigning leader to -1 (i.e. no leader) {code} The problem still lies in the brokerRegistration having no log dirs. When the controller adds the `PartitionRegistration` record, it uses the info from the broker's old `brokerRegistration`, so it gets nothing for the log dir. Later, on the broker side, in `ReplicaManager#maybeUpdateTopicAssignment`, we'll find that the assigned log dir UUID (which is all 0, i.e. MIGRATING) differs from the one in logManager (the real log dir UUID). I'm not sure whether we can skip the topic assignment update when the UUID is MIGRATING and we have a real UUID. I can create another ticket if you think that's a separate issue. 
> New created topics are unavailable after upgrading to 3.7 > - > > Key: KAFKA-16162 > URL: https://issues.apache.org/jira/browse/KAFKA-16162 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.7.0 >Reporter: Luke Chen >Priority: Blocker > > In 3.7, we introduced the KIP-858 JBOD feature, and the brokerRegistration > request will include the `LogDirs` fields with UUID for each log dir in each > broker. This info will be stored in the controller and used to identify if > the log dir is known and online while handling AssignReplicasToDirsRequest > [here|https://github.com/apache/kafka/blob/trunk/metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java#L2093]. > > While upgrading from old version, the kafka cluster will run in 3.7 binary > with old metadata version, and then upgrade to newer version using > kafka-features.sh. That means, while brokers startup and send the > brokerRegistration request, it'll be using older metadata version without > `LogDirs` fields included. And it makes the controller has no log dir info > for all brokers. Later, after upgraded, if new topic is created, the flow > will go like this: > 1. Controller assign replicas and adds in metadata log > 2. brokers fetch the metadata and apply it > 3. ReplicaManager#maybeUpdateTopicAssignment will update topic assignment > 4. After sending ASSIGN_REPLICAS_TO_DIRS to controller with replica > assignment, controller will think the log dir in current replica is offline, > so triggering offline handler, and reassign leader to another replica, and > offline, until no more replicas to assign, so assigning leader to -1 (i.e. no > leader) > So, the results will be that new created topics are unavailable (with no > leader) because the controller thinks all log dir are offline. 
> {code:java} > lukchen@lukchen-mac kafka % bin/kafka-topics.sh --describe --topic > quickstart-events3 --bootstrap-server localhost:9092 > > Topic: quickstart-events3 TopicId: s8s6tEQyRvmjKI6ctNTgPg PartitionCount: > 3 ReplicationFactor: 3Configs: segment.bytes=1073741824 > Topic: quickstart-events3 Partition: 0Leader: none > Replicas: 7,2,6 Isr: 6 > Topic: quickstart-events3 Partition: 1Leader: none > Replicas: 2,6,7 Isr: 6 > Topic: quickstart-events3 Partition: 2Leader: none > Replicas: 6,7,2 Isr: 6 > {code} > The log snippet in the controller : > {code:java} > # handling 1st assignReplicaToDirs request > [2024-01-18 19:34:47,370] DEBUG [QuorumController id=1] Broker 6 assigned > partition quickstart-events3:0 to OFFLINE dir 7K5JBERyyqFFxIXSXYluJA > (org.apache.kafka.controller.ReplicationControlManager) > [2024-01-18 19:34:47,370] DEBUG [QuorumController id=1] Broker 6 assigned > partition quickstart-events3:2 to OFFLINE dir 7K5JBERyyqFFxIXSXYluJA > (org.apache.kafka.controller.ReplicationControlManager) > [2024-01-18 19:34:47,371] DEBUG [QuorumController id=1] Broker 6 assigned > partition quickstart-events3:1 to OFFLINE dir 7K5JBERyyqFFxIXSXYluJA > (org.apache.kafka.controller.ReplicationControlManager) > [2024-01-18 19:34:47,372] DEBU
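The skip condition floated in the comment above (an open question in the discussion, not a confirmed fix) can be sketched like this. All names are hypothetical, and MIGRATING here stands for the all-zero directory sentinel that the controller holds when a broker registered without `LogDirs`:

```java
import java.util.UUID;

// Hypothetical sketch of the proposed skip: if the controller-assigned dir is
// still the all-zero MIGRATING sentinel (broker registered under an old
// metadata version without LogDirs), the broker could skip re-reporting the
// assignment instead of triggering the offline-dir path described above.
public class DirAssignmentSketch {
    public static final UUID MIGRATING = new UUID(0L, 0L); // all-zero sentinel

    public static boolean shouldReportAssignment(UUID assignedDir, UUID actualDir) {
        if (MIGRATING.equals(assignedDir)) return false; // controller has no dir info yet
        return !assignedDir.equals(actualDir);           // real mismatch: replica moved dirs
    }

    public static void main(String[] args) {
        UUID realDir = UUID.fromString("11111111-2222-3333-4444-555555555555");
        System.out.println(shouldReportAssignment(MIGRATING, realDir)); // false: skip during migration
        UUID otherDir = UUID.fromString("99999999-8888-7777-6666-555555555555");
        System.out.println(shouldReportAssignment(otherDir, realDir));  // true: genuine mismatch
        System.out.println(shouldReportAssignment(realDir, realDir));   // false: already in sync
    }
}
```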
[jira] [Assigned] (KAFKA-16144) Controller leader checkQuorum timer should skip only 1 controller case
[ https://issues.apache.org/jira/browse/KAFKA-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen reassigned KAFKA-16144: - Assignee: Luke Chen > Controller leader checkQuorum timer should skip only 1 controller case > -- > > Key: KAFKA-16144 > URL: https://issues.apache.org/jira/browse/KAFKA-16144 > Project: Kafka > Issue Type: Bug >Reporter: Luke Chen >Assignee: Luke Chen >Priority: Major > Labels: newbie, newbie++ > > In KAFKA-15489, we fixed the potential "split brain" issue by adding the > check quorum timer. This timer will be updated when the follower fetch > request arrived. And it expires the timer when the there are no majority of > voter followers fetch from leader, and resign the leadership. > But in KAFKA-15489, we forgot to consider the case where there's only 1 > controller node. If there's only 1 controller node (and no broker node), > there will be no fetch request arrived, so the timer will expire each time. > However, if there's only 1 node, we don't have to care about the "check > quorum" at all. We should skip the check for only 1 controller node case. > Currently, this issue will happen only when there's 1 controller node and no > any broker node (i.e. no fetch request sent to the controller). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-16163) Constant resignation/reelection of controller when starting a single node in combined mode
[ https://issues.apache.org/jira/browse/KAFKA-16163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen resolved KAFKA-16163. --- Resolution: Duplicate > Constant resignation/reelection of controller when starting a single node in > combined mode > -- > > Key: KAFKA-16163 > URL: https://issues.apache.org/jira/browse/KAFKA-16163 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.7.0 >Reporter: Mickael Maison >Priority: Major > > When starting a single node in combined mode: > {noformat} > $ KAFKA_CLUSTER_ID="$(bin/kafka-storage.sh random-uuid)" > $ bin/kafka-storage.sh format -t $KAFKA_CLUSTER_ID -c > config/kraft/server.properties > $ bin/kafka-server-start.sh config/kraft/server.properties{noformat} > > it's constantly spamming the logs with: > {noformat} > [2024-01-18 17:37:09,065] INFO > [broker-1-to-controller-heartbeat-channel-manager]: Recorded new controller, > from now on will use node localhost:9093 (id: 1 rack: null) > (kafka.server.NodeToControllerRequestThread) > [2024-01-18 17:37:11,967] INFO [RaftManager id=1] Did not receive fetch > request from the majority of the voters within 3000ms. Current fetched voters > are []. 
(org.apache.kafka.raft.LeaderState) > [2024-01-18 17:37:11,967] INFO [RaftManager id=1] Completed transition to > ResignedState(localId=1, epoch=138, voters=[1], electionTimeoutMs=1864, > unackedVoters=[], preferredSuccessors=[]) from Leader(localId=1, epoch=138, > epochStartOffset=829, highWatermark=Optional[LogOffsetMetadata(offset=835, > metadata=Optional[(segmentBaseOffset=0,relativePositionInSegment=62788)])], > voterStates={1=ReplicaState(nodeId=1, > endOffset=Optional[LogOffsetMetadata(offset=835, > metadata=Optional[(segmentBaseOffset=0,relativePositionInSegment=62788)])], > lastFetchTimestamp=-1, lastCaughtUpTimestamp=-1, > hasAcknowledgedLeader=true)}) (org.apache.kafka.raft.QuorumState) > [2024-01-18 17:37:13,072] INFO [NodeToControllerChannelManager id=1 > name=heartbeat] Client requested disconnect from node 1 > (org.apache.kafka.clients.NetworkClient) > [2024-01-18 17:37:13,072] INFO > [broker-1-to-controller-heartbeat-channel-manager]: Recorded new controller, > from now on will use node localhost:9093 (id: 1 rack: null) > (kafka.server.NodeToControllerRequestThread) > [2024-01-18 17:37:13,123] INFO > [broker-1-to-controller-heartbeat-channel-manager]: Recorded new controller, > from now on will use node localhost:9093 (id: 1 rack: null) > (kafka.server.NodeToControllerRequestThread) > [2024-01-18 17:37:13,124] INFO [NodeToControllerChannelManager id=1 > name=heartbeat] Client requested disconnect from node 1 > (org.apache.kafka.clients.NetworkClient) > [2024-01-18 17:37:13,124] INFO > [broker-1-to-controller-heartbeat-channel-manager]: Recorded new controller, > from now on will use node localhost:9093 (id: 1 rack: null) > (kafka.server.NodeToControllerRequestThread) > [2024-01-18 17:37:13,175] INFO > [broker-1-to-controller-heartbeat-channel-manager]: Recorded new controller, > from now on will use node localhost:9093 (id: 1 rack: null) > (kafka.server.NodeToControllerRequestThread) > [2024-01-18 17:37:13,176] INFO [NodeToControllerChannelManager 
id=1 > name=heartbeat] Client requested disconnect from node 1 > (org.apache.kafka.clients.NetworkClient) > [2024-01-18 17:37:13,176] INFO > [broker-1-to-controller-heartbeat-channel-manager]: Recorded new controller, > from now on will use node localhost:9093 (id: 1 rack: null) > (kafka.server.NodeToControllerRequestThread) > [2024-01-18 17:37:13,227] INFO > [broker-1-to-controller-heartbeat-channel-manager]: Recorded new controller, > from now on will use node localhost:9093 (id: 1 rack: null) > (kafka.server.NodeToControllerRequestThread) > [2024-01-18 17:37:13,229] INFO [NodeToControllerChannelManager id=1 > name=heartbeat] Client requested disconnect from node 1 > (org.apache.kafka.clients.NetworkClient) > [2024-01-18 17:37:13,229] INFO > [broker-1-to-controller-heartbeat-channel-manager]: Recorded new controller, > from now on will use node localhost:9093 (id: 1 rack: null) > (kafka.server.NodeToControllerRequestThread) > [2024-01-18 17:37:13,279] INFO > [broker-1-to-controller-heartbeat-channel-manager]: Recorded new controller, > from now on will use node localhost:9093 (id: 1 rack: null) > (kafka.server.NodeToControllerRequestThread){noformat} > This did not happen in 3.6. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-16162) New created topics are unavailable after upgrading to 3.7
[ https://issues.apache.org/jira/browse/KAFKA-16162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen updated KAFKA-16162: -- Description: In 3.7, we introduced the KIP-858 JBOD feature, and the brokerRegistration request will include the `LogDirs` fields with UUID for each log dir in each broker. This info will be stored in the controller and used to identify if the log dir is known and online while handling AssignReplicasToDirsRequest [here|https://github.com/apache/kafka/blob/trunk/metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java#L2093]. While upgrading from old version, the kafka cluster will run in 3.7 binary with old metadata version, and then upgrade to newer version using kafka-features.sh. That means, while brokers startup and send the brokerRegistration request, it'll be using older metadata version without `LogDirs` fields included. And it makes the controller has no log dir info for all brokers. Later, after upgraded, if new topic is created, the flow will go like this: 1. Controller assign replicas and adds in metadata log 2. brokers fetch the metadata and apply it 3. ReplicaManager#maybeUpdateTopicAssignment will update topic assignment 4. After sending ASSIGN_REPLICAS_TO_DIRS to controller with replica assignment, controller will think the log dir in current replica is offline, so triggering offline handler, and reassign leader to another replica, and offline, until no more replicas to assign, so assigning leader to -1 (i.e. no leader) So, the results will be that new created topics are unavailable (with no leader) because the controller thinks all log dir are offline. 
{code:java}
lukchen@lukchen-mac kafka % bin/kafka-topics.sh --describe --topic quickstart-events3 --bootstrap-server localhost:9092
Topic: quickstart-events3	TopicId: s8s6tEQyRvmjKI6ctNTgPg	PartitionCount: 3	ReplicationFactor: 3	Configs: segment.bytes=1073741824
	Topic: quickstart-events3	Partition: 0	Leader: none	Replicas: 7,2,6	Isr: 6
	Topic: quickstart-events3	Partition: 1	Leader: none	Replicas: 2,6,7	Isr: 6
	Topic: quickstart-events3	Partition: 2	Leader: none	Replicas: 6,7,2	Isr: 6
{code}
The log snippet in the controller:
{code:java}
# handling 1st assignReplicaToDirs request
[2024-01-18 19:34:47,370] DEBUG [QuorumController id=1] Broker 6 assigned partition quickstart-events3:0 to OFFLINE dir 7K5JBERyyqFFxIXSXYluJA (org.apache.kafka.controller.ReplicationControlManager)
[2024-01-18 19:34:47,370] DEBUG [QuorumController id=1] Broker 6 assigned partition quickstart-events3:2 to OFFLINE dir 7K5JBERyyqFFxIXSXYluJA (org.apache.kafka.controller.ReplicationControlManager)
[2024-01-18 19:34:47,371] DEBUG [QuorumController id=1] Broker 6 assigned partition quickstart-events3:1 to OFFLINE dir 7K5JBERyyqFFxIXSXYluJA (org.apache.kafka.controller.ReplicationControlManager)
[2024-01-18 19:34:47,372] DEBUG [QuorumController id=1] offline-dir-assignment: changing partition(s): quickstart-events3-0, quickstart-events3-2, quickstart-events3-1 (org.apache.kafka.controller.ReplicationControlManager)
[2024-01-18 19:34:47,372] DEBUG [QuorumController id=1] partition change for quickstart-events3-0 with topic ID 6ZIeidfiSTWRiOAmGEwn_g: directories: [AA, AA, AA] -> [7K5JBERyyqFFxIXSXYluJA, AA, AA], partitionEpoch: 0 -> 1 (org.apache.kafka.controller.ReplicationControlManager)
[2024-01-18 19:34:47,372] DEBUG [QuorumController id=1] Replayed partition change PartitionChangeRecord(partitionId=0, topicId=6ZIeidfiSTWRiOAmGEwn_g, isr=null, leader=-2, replicas=null, removingReplicas=null, addingReplicas=null, leaderRecoveryState=-1, directories=[7K5JBERyyqFFxIXSXYluJA, AA, AA], eligibleLeaderReplicas=null, lastKnownELR=null) for topic quickstart-events3 (org.apache.kafka.controller.ReplicationControlManager)
[2024-01-18 19:34:47,372] DEBUG [QuorumController id=1] partition change for quickstart-events3-2 with topic ID 6ZIeidfiSTWRiOAmGEwn_g: directories: [AA, AA, AA] -> [AA, 7K5JBERyyqFFxIXSXYluJA, AA], partitionEpoch: 0 -> 1 (org.apache.kafka.controller.ReplicationControlManager)
[2024-01-18 19:34:47,372] DEBUG [QuorumController id=1] Replayed partition change PartitionChangeRecord(partitionId=2, topicId=6ZIeidfiSTWRiOAmGEwn_g, isr=null, leader=-2, replicas=null, removingReplicas=null, addingReplicas=null, leaderRecoveryState=-1, directories=[AA, 7K5JBERyyqFFxIXSXYluJA, AA], eligibleLeaderReplicas=null, lastKnownELR=null) for topic quickstart-events3 (org.apache.kafka.controller.ReplicationControlManager)
[2024-01
{code}
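The failure mode described above can be sketched with a small, purely hypothetical simulation (this is not Kafka code; all names are illustrative): a broker that registered under the pre-3.7 metadata version reported no log dir UUIDs, so every directory the controller later sees in an AssignReplicasToDirsRequest looks unknown, hence offline, and the leader walk ends at -1.

```python
# Hypothetical model of the controller's decision, for illustration only.
# A dir counts as online only if the owning broker registered it; brokers
# that registered with the old metadata version registered no dirs at all.

def dir_is_online(registered_dirs_by_broker, broker_id, dir_uuid):
    return dir_uuid in registered_dirs_by_broker.get(broker_id, set())

def pick_leader(replicas, assigned_dir_by_broker, registered_dirs_by_broker):
    """Walk the replica list and pick the first replica whose assigned
    log dir is known to the controller; -1 means no leader."""
    for broker_id in replicas:
        if dir_is_online(registered_dirs_by_broker, broker_id,
                         assigned_dir_by_broker[broker_id]):
            return broker_id
    return -1

# All three brokers registered before the metadata upgrade: empty dir sets.
registered = {6: set(), 7: set(), 2: set()}
assigned = {6: "7K5JBERyyqFFxIXSXYluJA", 7: "d1", 2: "d2"}
print(pick_leader([7, 2, 6], assigned, registered))  # prints -1 (no leader)
```

Once any broker re-registers with its real dir UUIDs, the same walk would return that broker, which matches the observed behavior that the problem is specific to registrations made under the old metadata version.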
[jira] [Created] (KAFKA-16162) New created topics are unavailable after upgrading to 3.7
Luke Chen created KAFKA-16162: - Summary: New created topics are unavailable after upgrading to 3.7 Key: KAFKA-16162 URL: https://issues.apache.org/jira/browse/KAFKA-16162 Project: Kafka Issue Type: Bug Reporter: Luke Chen In 3.7, we introduced the KIP-858 JBOD feature, so the brokerRegistration request includes the `LogDirs` field, with a UUID for each log dir on each broker. The controller stores this info and uses it to decide whether a log dir is known and online while handling AssignReplicasToDirsRequest [here|https://github.com/apache/kafka/blob/trunk/metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java#L2093]. When upgrading from an older version, the Kafka cluster first runs the 3.7 binary with the old metadata version, and is then upgraded to the newer metadata version using kafka-features.sh. That means that when the brokers start up and send the brokerRegistration request, they use the older metadata version, which does not include the `LogDirs` field, so the controller has no log dir info for any broker. Later, after the upgrade, when a new topic is created, the flow goes like this:
1. The controller assigns replicas and adds the assignment to the metadata log.
2. Brokers fetch the metadata and apply it.
3. ReplicaManager#maybeUpdateTopicAssignment updates the topic assignment.
4. When ASSIGN_REPLICAS_TO_DIRS with the replica assignment reaches the controller, the controller considers the log dir of the current replica offline, triggers the offline handler, and reassigns the replica, which is also considered offline, and so on, until there are no more replicas to assign, at which point it sets the leader to -1 (i.e. no leader).
As a result, newly created topics are unavailable (with no leader) because the controller thinks all log dirs are offline.
[jira] [Resolved] (KAFKA-16132) Upgrading from 3.6 to 3.7 in KRaft will have seconds of partitions unavailable
[ https://issues.apache.org/jira/browse/KAFKA-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen resolved KAFKA-16132. --- Resolution: Duplicate
> Upgrading from 3.6 to 3.7 in KRaft will have seconds of partitions unavailable
> --
>
> Key: KAFKA-16132
> URL: https://issues.apache.org/jira/browse/KAFKA-16132
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 3.7.0
> Reporter: Luke Chen
> Priority: Blocker
>
> When upgrading from 3.6 to 3.7, we noticed that after upgrading the metadata version, all the partitions are reset at once, which causes a short period of unavailability. This didn't happen before.
> {code:java}
> [2024-01-15 20:45:19,757] INFO [BrokerMetadataPublisher id=2] Updating metadata.version to 19 at offset OffsetAndEpoch(offset=229, epoch=2). (kafka.server.metadata.BrokerMetadataPublisher)
> [2024-01-15 20:45:29,915] INFO [ReplicaFetcherManager on broker 2] Removed fetcher for partitions Set(t1-29, t1-25, t1-21, t1-17, t1-46, t1-13, t1-42, t1-9, t1-38, t1-5, t1-34, t1-1, t1-30, t1-26, t1-22, t1-18, t1-47, t1-14, t1-43, t1-10, t1-39, t1-6, t1-35, t1-2, t1-31, t1-27, t1-23, t1-19, t1-48, t1-15, t1-44, t1-11, t1-40, t1-7, t1-36, t1-3, t1-32, t1-28, t1-24, t1-20, t1-49, t1-16, t1-45, t1-12, t1-41, t1-8, t1-37, t1-4, t1-33, t1-0) (kafka.server.ReplicaFetcherManager)
> {code}
> Complete log: https://gist.github.com/showuon/665aa3ce6afd59097a2662f8260ecc10
> Steps:
> 1. Start up a 3.6 Kafka cluster in KRaft mode with 1 broker
> 2. Create a topic
> 3. Upgrade the binary to 3.7
> 4. Use kafka-features.sh to upgrade to the 3.7 metadata version
> 5. Check the log (and metrics, if interested)
> Analysis: In 3.7, we have JBOD support in KRaft, so partitionRegistration added a new directories field, and this causes a diff to be found while comparing the metadata delta. We might be able to recognize that this added-directories change doesn't need to reset the leader/follower state, and only update the metadata, to avoid causing unavailability.
-- This message was sent by Atlassian Jira (v8.20.10#820010)
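The suggested fix direction can be illustrated with a small, hypothetical sketch (not Kafka code; the field set is an assumption for illustration): when comparing the old and new partition registrations, a change only in the new directories field should not reset the replica fetchers, only leadership-related changes should.

```python
# Illustrative only: decide whether a partition-registration delta requires
# resetting the replica fetcher. The field list is an assumed simplification.
LEADERSHIP_FIELDS = ("leader", "leaderEpoch", "isr", "replicas")

def needs_fetcher_reset(old, new):
    """True only when a leadership-related field changed; a directories-only
    change is treated as a metadata-only update."""
    return any(old.get(f) != new.get(f) for f in LEADERSHIP_FIELDS)

old = {"leader": 2, "leaderEpoch": 5, "isr": [2], "replicas": [2], "directories": None}
new = {**old, "directories": ["w_uxN7pwQ6eXSMrOKceYIQ"]}
print(needs_fetcher_reset(old, new))  # False: directories-only change
```

Under this model, the metadata-version upgrade (which only populates directories) would no longer remove fetchers for every partition at once.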
[jira] [Commented] (KAFKA-16132) Upgrading from 3.6 to 3.7 in KRaft will have seconds of partitions unavailable
[ https://issues.apache.org/jira/browse/KAFKA-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17807191#comment-17807191 ] Luke Chen commented on KAFKA-16132: --- I just tried it using this branch: https://github.com/apache/kafka/pull/15197 , and the issue doesn't happen anymore. So it looks like it was caused by KAFKA-16131. Closing this ticket. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16144) Controller leader checkQuorum timer should skip only 1 controller case
Luke Chen created KAFKA-16144: - Summary: Controller leader checkQuorum timer should skip only 1 controller case Key: KAFKA-16144 URL: https://issues.apache.org/jira/browse/KAFKA-16144 Project: Kafka Issue Type: Bug Reporter: Luke Chen In KAFKA-15489, we fixed the potential "split brain" issue by adding the check-quorum timer. This timer is updated when a follower fetch request arrives, and it expires when no majority of voter followers has fetched from the leader, in which case the leader resigns leadership. But in KAFKA-15489, we forgot to consider the case where there is only 1 controller node. If there is only 1 controller node (and no broker nodes), no fetch requests arrive, so the timer expires every time. However, with only 1 node we don't need the "check quorum" at all, so we should skip the check in the single-controller case. Currently, this issue only happens when there is 1 controller node and no broker nodes (i.e. no fetch requests are sent to the controller). -- This message was sent by Atlassian Jira (v8.20.10#820010)
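The proposed skip can be sketched with a simplified, hypothetical model of the check-quorum decision (not the actual Raft implementation; names are illustrative): with a single-voter quorum the leader alone is the majority and no follower fetches will ever arrive, so the timer should never force a resignation.

```python
# Illustrative model of the check-quorum decision, for a quorum of
# `voter_count` voters where `followers_fetched_in_window` is the set of
# follower voter ids that fetched within the timeout window.

def should_resign(voter_count, followers_fetched_in_window):
    """True when the leader should resign because no majority is fetching."""
    if voter_count == 1:
        # Single-controller quorum: the leader is the entire quorum and no
        # fetch requests are expected, so skip the check entirely.
        return False
    majority = voter_count // 2 + 1
    # The leader counts itself toward the majority.
    return 1 + len(followers_fetched_in_window) < majority

print(should_resign(1, set()))   # False: the bug fix, skip the check
print(should_resign(3, set()))   # True: no followers fetched, resign
```

With the `voter_count == 1` guard removed, the first call would return True on every expiry, which is the misbehavior the issue describes.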
[jira] [Commented] (KAFKA-16131) Repeated UnsupportedVersionException logged when running Kafka 3.7.0-RC2 KRaft cluster with metadata version 3.6
[ https://issues.apache.org/jira/browse/KAFKA-16131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17806800#comment-17806800 ] Luke Chen commented on KAFKA-16131: --- KAFKA-16132 is also a bug related to JBOD after upgrading to 3.7. FYI > Repeated UnsupportedVersionException logged when running Kafka 3.7.0-RC2 > KRaft cluster with metadata version 3.6 > > > Key: KAFKA-16131 > URL: https://issues.apache.org/jira/browse/KAFKA-16131 > Project: Kafka > Issue Type: Bug >Reporter: Jakub Scholz >Priority: Major > Fix For: 3.7.0 > > > When running Kafka 3.7.0-RC2 as a KRaft cluster with metadata version set to > 3.6-IV2 metadata version, it throws repeated errors like this in the > controller logs: > {quote}2024-01-13 16:58:01,197 INFO [QuorumController id=0] > assignReplicasToDirs: event failed with UnsupportedVersionException in 15 > microseconds. (org.apache.kafka.controller.QuorumController) > [quorum-controller-0-event-handler] > 2024-01-13 16:58:01,197 ERROR [ControllerApis nodeId=0] Unexpected error > handling request RequestHeader(apiKey=ASSIGN_REPLICAS_TO_DIRS, apiVersion=0, > clientId=1000, correlationId=14, headerVersion=2) – > AssignReplicasToDirsRequestData(brokerId=1000, brokerEpoch=5, > directories=[DirectoryData(id=w_uxN7pwQ6eXSMrOKceYIQ, > topics=[TopicData(topicId=bvAKLSwmR7iJoKv2yZgygQ, > partitions=[PartitionData(partitionIndex=2), > PartitionData(partitionIndex=1)]), TopicData(topicId=uNe7f5VrQgO0zST6yH1jDQ, > partitions=[PartitionData(partitionIndex=0)])])]) with context > RequestContext(header=RequestHeader(apiKey=ASSIGN_REPLICAS_TO_DIRS, > apiVersion=0, clientId=1000, correlationId=14, headerVersion=2), > connectionId='172.16.14.219:9090-172.16.14.217:53590-7', > clientAddress=/[172.16.14.217|http://172.16.14.217/], > principal=User:CN=my-cluster-kafka,O=io.strimzi, > listenerName=ListenerName(CONTROLPLANE-9090), securityProtocol=SSL, > clientInformation=ClientInformation(softwareName=apache-kafka-java, > 
softwareVersion=3.7.0), fromPrivilegedListener=false, > principalSerde=Optional[org.apache.kafka.common.security.authenticator.DefaultKafkaPrincipalBuilder@71004ad2]) > (kafka.server.ControllerApis) [quorum-controller-0-event-handler] > java.util.concurrent.CompletionException: > org.apache.kafka.common.errors.UnsupportedVersionException: Directory > assignment is not supported yet. > at > java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:332) > at > java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:347) > at > java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:636) > at > java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) > at > java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162) > at > org.apache.kafka.controller.QuorumController$ControllerWriteEvent.complete(QuorumController.java:880) > at > org.apache.kafka.controller.QuorumController$ControllerWriteEvent.handleException(QuorumController.java:871) > at > org.apache.kafka.queue.KafkaEventQueue$EventContext.completeWithException(KafkaEventQueue.java:148) > at > org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:137) > at > org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:210) > at > org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:181) > at java.base/java.lang.Thread.run(Thread.java:840) > Caused by: org.apache.kafka.common.errors.UnsupportedVersionException: > Directory assignment is not supported yet. > {quote} > > With the metadata version set to 3.6-IV2, it makes sense that the request is > not supported. But the request should in such case not be sent at all. -- This message was sent by Atlassian Jira (v8.20.10#820010)
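The suggested fix in the last sentence can be sketched as a client-side guard (hypothetical code, not Kafka's actual API; real Kafka compares MetadataVersion enum values, not strings): only send ASSIGN_REPLICAS_TO_DIRS when the active metadata version supports directory assignment.

```python
# Illustrative guard only. Parses "major.minor" from version strings like
# "3.6-IV2" for a simple ordered comparison; the 3.7 threshold is an
# assumption for illustration.

def supports_dir_assignment(metadata_version):
    major, minor = metadata_version.split("-")[0].split(".")
    return (int(major), int(minor)) >= (3, 7)

def maybe_send_assign_replicas_to_dirs(metadata_version, send):
    if supports_dir_assignment(metadata_version):
        send()
    # Otherwise skip the request entirely instead of letting the controller
    # reject it with UnsupportedVersionException on every attempt.

print(supports_dir_assignment("3.6-IV2"))  # False: request would be skipped
```

This mirrors the observation above: with the cluster on 3.6-IV2 the rejection itself is correct, and the bug is only that the broker keeps sending the request.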
[jira] [Commented] (KAFKA-16132) Upgrading from 3.6 to 3.7 in KRaft will have seconds of partitions unavailable
[ https://issues.apache.org/jira/browse/KAFKA-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17806799#comment-17806799 ] Luke Chen commented on KAFKA-16132: --- [~cmccabe] [~soarez], please take a look. Thanks. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16132) Upgrading from 3.6 to 3.7 in KRaft will have seconds of partitions unavailable
Luke Chen created KAFKA-16132: - Summary: Upgrading from 3.6 to 3.7 in KRaft will have seconds of partitions unavailable Key: KAFKA-16132 URL: https://issues.apache.org/jira/browse/KAFKA-16132 Project: Kafka Issue Type: Bug Affects Versions: 3.7.0 Reporter: Luke Chen When upgrading from 3.6 to 3.7, we noticed that after upgrading the metadata version, all the partitions are reset at once, which causes a short period of unavailability. This didn't happen before.
{code:java}
[2024-01-15 20:45:19,757] INFO [BrokerMetadataPublisher id=2] Updating metadata.version to 19 at offset OffsetAndEpoch(offset=229, epoch=2). (kafka.server.metadata.BrokerMetadataPublisher)
[2024-01-15 20:45:29,915] INFO [ReplicaFetcherManager on broker 2] Removed fetcher for partitions Set(t1-29, t1-25, t1-21, t1-17, t1-46, t1-13, t1-42, t1-9, t1-38, t1-5, t1-34, t1-1, t1-30, t1-26, t1-22, t1-18, t1-47, t1-14, t1-43, t1-10, t1-39, t1-6, t1-35, t1-2, t1-31, t1-27, t1-23, t1-19, t1-48, t1-15, t1-44, t1-11, t1-40, t1-7, t1-36, t1-3, t1-32, t1-28, t1-24, t1-20, t1-49, t1-16, t1-45, t1-12, t1-41, t1-8, t1-37, t1-4, t1-33, t1-0) (kafka.server.ReplicaFetcherManager)
{code}
Complete log: https://gist.github.com/showuon/665aa3ce6afd59097a2662f8260ecc10
Steps:
1. Start up a 3.6 Kafka cluster in KRaft mode with 1 broker
2. Create a topic
3. Upgrade the binary to 3.7
4. Use kafka-features.sh to upgrade to the 3.7 metadata version
5. Check the log (and metrics, if interested)
Analysis: In 3.7, we have JBOD support in KRaft, so partitionRegistration added a new directories field, and this causes a diff to be found while comparing the metadata delta. We might be able to recognize that this added-directories change doesn't need to reset the leader/follower state, and only update the metadata, to avoid causing unavailability. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-16101) Kafka cluster unavailable during KRaft migration rollback procedure
[ https://issues.apache.org/jira/browse/KAFKA-16101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17805548#comment-17805548 ] Luke Chen edited comment on KAFKA-16101 at 1/11/24 12:12 PM: - It seems we didn't consider it during our KIP design. I'm thinking we might need another flag to let brokers know this is rolling back and to update the controller znode in ZK for brokers. Thoughts? was (Author: showuon): It seems we didn't consider it during our KIP design. I'm thinking we might need another flag to let brokers know this is rollbacking and update the controller node in ZK for brokers. Thoughts? > Kafka cluster unavailable during KRaft migration rollback procedure > --- > > Key: KAFKA-16101 > URL: https://issues.apache.org/jira/browse/KAFKA-16101 > Project: Kafka > Issue Type: Bug > Components: kraft >Affects Versions: 3.6.1 >Reporter: Paolo Patierno >Priority: Major > > Hello, > I was trying the KRaft migration rollback procedure locally and I came across > a potential bug or anyway a situation where the cluster is not > usable/available for a certain amount of time. > In order to test the procedure, I start with a one broker (broker ID = 0) and > one zookeeper node cluster. Then I start the migration with a one KRaft > controller node (broker ID = 1). The migration runs fine and it reaches the > point of "dual write" state. > From this point, I try to run the rollback procedure as described in the > documentation. > As first step, this involves ... 
> * stopping the broker > * removing the __cluster_metadata folder > * removing ZooKeeper migration flag and controller(s) related configuration > from the broker > * restarting the broker > With the above steps done, the broker starts in ZooKeeper mode (no migration, > no KRaft controllers knowledge) and it keeps logging the following messages > in DEBUG: > {code:java} > [2024-01-08 11:51:20,608] DEBUG > [zk-broker-0-to-controller-forwarding-channel-manager]: Controller isn't > cached, looking for local metadata changes > (kafka.server.BrokerToControllerRequestThread) > [2024-01-08 11:51:20,608] DEBUG > [zk-broker-0-to-controller-forwarding-channel-manager]: No controller > provided, retrying after backoff > (kafka.server.BrokerToControllerRequestThread) > [2024-01-08 11:51:20,629] DEBUG > [zk-broker-0-to-controller-alter-partition-channel-manager]: Controller isn't > cached, looking for local metadata changes > (kafka.server.BrokerToControllerRequestThread) > [2024-01-08 11:51:20,629] DEBUG > [zk-broker-0-to-controller-alter-partition-channel-manager]: No controller > provided, retrying after backoff > (kafka.server.BrokerToControllerRequestThread) {code} > What's happening should be clear. > The /controller znode in ZooKeeper still reports the KRaft controller (broker > ID = 1) as controller. The broker gets it from the znode but doesn't know how > to reach it. > The issue is that until the procedure isn't fully completed with the next > steps (shutting down KRaft controller, deleting /controller znode), the > cluster is unusable. Any admin or client operation against the broker doesn't > work, just hangs, the broker doesn't reply. > Imagining this scenario to a more complex one with 10-20-50 brokers and > partitions' replicas spread across them, when the brokers are rolled one by > one (in ZK mode) reporting the above error, the topics will become not > available one after the other, until all brokers are in such a state and > nothing can work. 
This is because from a KRaft controller perspective (still > running), the brokers are not available anymore and the partitions' replicas > are out of sync. > Of course, as soon as you complete the rollback procedure, after deleting the > /controller znode, the brokers are able to elect a new controller among them > and everything recovers to work. > My first question ... isn't the cluster supposed to work during rollback and > being always available during the rollback when the procedure is not > completed yet? Or having the cluster not available is an assumption during > the rollback, until it's fully completed? > This "unavailability" time window could be reduced by deleting the > /controller znode before shutting down the KRaft controllers to allow the > brokers electing a new controller among them, but in this case, could there > be a race condition where KRaft controllers still running could steal > leadership again? > Or is there anything missing in the documentation maybe which is driving to > this problem? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-16101) Kafka cluster unavailable during KRaft migration rollback procedure
[ https://issues.apache.org/jira/browse/KAFKA-16101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17805548#comment-17805548 ] Luke Chen commented on KAFKA-16101: --- It seems we didn't consider it during our KIP design. I'm thinking we might need another flag to let brokers know this is rolling back and update the controller node in ZK for brokers. Thoughts? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-16101) Kafka cluster unavailable during KRaft migration rollback procedure
[ https://issues.apache.org/jira/browse/KAFKA-16101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17805009#comment-17805009 ] Luke Chen commented on KAFKA-16101: --- [~cmccabe][~mumrah], I'd like to hear your thoughts about this availability issue during the rollback from KRaft mode to ZK mode. Thanks. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-16085) remote copy lag bytes/segments metrics don't update all topic value
[ https://issues.apache.org/jira/browse/KAFKA-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17803452#comment-17803452 ] Luke Chen commented on KAFKA-16085: --- [~christo_lolov] [~satishd] [~ckamal], this is the part we missed. FYI > remote copy lag bytes/segments metrics don't update all topic value > --- > > Key: KAFKA-16085 > URL: https://issues.apache.org/jira/browse/KAFKA-16085 > Project: Kafka > Issue Type: Bug >Reporter: Luke Chen >Priority: Major > > The metrics added in > [KIP-963|https://cwiki.apache.org/confluence/display/KAFKA/KIP-963%3A+Additional+metrics+in+Tiered+Storage#KIP963:AdditionalmetricsinTieredStorage-Copy] > is BrokerTopicMetrics, which means it should provide per-topic metric value > and all topics metric value. But current implementation doesn't update all > topic metric value. > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16085) remote copy lag bytes/segments metrics don't update all topic value
Luke Chen created KAFKA-16085: - Summary: remote copy lag bytes/segments metrics don't update all topic value Key: KAFKA-16085 URL: https://issues.apache.org/jira/browse/KAFKA-16085 Project: Kafka Issue Type: Bug Reporter: Luke Chen The metrics added in [KIP-963|https://cwiki.apache.org/confluence/display/KAFKA/KIP-963%3A+Additional+metrics+in+Tiered+Storage#KIP963:AdditionalmetricsinTieredStorage-Copy] are BrokerTopicMetrics, which means they should provide both a per-topic metric value and an all-topics metric value. But the current implementation doesn't update the all-topics metric value. -- This message was sent by Atlassian Jira (v8.20.10#820010)
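The expected behavior can be sketched with a minimal model. This is not Kafka's BrokerTopicMetrics code; the class and method names below are hypothetical, and the sketch only shows the contract the report describes: recording a per-topic value must also refresh the all-topics aggregate.

```python
from collections import defaultdict

class BrokerTopicMetrics:
    """Hypothetical minimal model of a per-topic + all-topics metric pair."""

    def __init__(self):
        self.per_topic = defaultdict(int)  # latest lag per topic
        self.all_topics = 0                # aggregate over all topics

    def record_remote_copy_lag(self, topic: str, lag_bytes: int) -> None:
        self.per_topic[topic] = lag_bytes
        # Refreshing the aggregate as well is the step the report says was
        # missed: without it, only the per-topic value stays current.
        self.all_topics = sum(self.per_topic.values())

m = BrokerTopicMetrics()
m.record_remote_copy_lag("t1", 100)
m.record_remote_copy_lag("t2", 50)
```

After the two calls, the sketch reports `per_topic["t1"] == 100` and `all_topics == 150`; omitting the aggregate update would leave `all_topics` stuck at zero, which mirrors the bug.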
[jira] [Resolved] (KAFKA-16079) Fix leak in LocalLeaderEndPointTest/FinalizedFeatureChangeListenerTest/KafkaApisTest/ReplicaManagerConcurrencyTest
[ https://issues.apache.org/jira/browse/KAFKA-16079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen resolved KAFKA-16079. --- Fix Version/s: 3.7.0 Resolution: Fixed > Fix leak in > LocalLeaderEndPointTest/FinalizedFeatureChangeListenerTest/KafkaApisTest/ReplicaManagerConcurrencyTest > -- > > Key: KAFKA-16079 > URL: https://issues.apache.org/jira/browse/KAFKA-16079 > Project: Kafka > Issue Type: Sub-task >Reporter: Luke Chen >Assignee: Luke Chen >Priority: Major > Fix For: 3.7.0 > > > Fix leak in > LocalLeaderEndPointTest/FinalizedFeatureChangeListenerTest/KafkaApisTest/ReplicaManagerConcurrencyTest -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16079) Fix leak in LocalLeaderEndPointTest/FinalizedFeatureChangeListenerTest/KafkaApisTest/ReplicaManagerConcurrencyTest
Luke Chen created KAFKA-16079: - Summary: Fix leak in LocalLeaderEndPointTest/FinalizedFeatureChangeListenerTest/KafkaApisTest/ReplicaManagerConcurrencyTest Key: KAFKA-16079 URL: https://issues.apache.org/jira/browse/KAFKA-16079 Project: Kafka Issue Type: Sub-task Reporter: Luke Chen Assignee: Luke Chen Fix leak in LocalLeaderEndPointTest/FinalizedFeatureChangeListenerTest/KafkaApisTest/ReplicaManagerConcurrencyTest -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16071) NPE in testDescribeUnderReplicatedPartitionsWhenReassignmentIsInProgress
Luke Chen created KAFKA-16071: - Summary: NPE in testDescribeUnderReplicatedPartitionsWhenReassignmentIsInProgress Key: KAFKA-16071 URL: https://issues.apache.org/jira/browse/KAFKA-16071 Project: Kafka Issue Type: Test Reporter: Luke Chen Found in the CI build result. h3. Error Message java.lang.NullPointerException h3. Stacktrace java.lang.NullPointerException at org.apache.kafka.tools.TopicCommandIntegrationTest.testDescribeUnderReplicatedPartitionsWhenReassignmentIsInProgress(TopicCommandIntegrationTest.java:800) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:728) at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60) at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131) https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-15095/1/testReport/junit/org.apache.kafka.tools/TopicCommandIntegrationTest/Build___JDK_8_and_Scala_2_12___testDescribeUnderReplicatedPartitionsWhenReassignmentIsInProgress_String__zk/ -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16065) Fix leak in DelayedOperationTest
Luke Chen created KAFKA-16065: - Summary: Fix leak in DelayedOperationTest Key: KAFKA-16065 URL: https://issues.apache.org/jira/browse/KAFKA-16065 Project: Kafka Issue Type: Sub-task Reporter: Luke Chen Assignee: Luke Chen Fix leak in DelayedOperationTest. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-16064) improve ControllerApiTest
[ https://issues.apache.org/jira/browse/KAFKA-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen updated KAFKA-16064: -- Labels: newbie newbie++ (was: ) > improve ControllerApiTest > - > > Key: KAFKA-16064 > URL: https://issues.apache.org/jira/browse/KAFKA-16064 > Project: Kafka > Issue Type: Test >Reporter: Luke Chen >Priority: Major > Labels: newbie, newbie++ > > It's usually more robust to automatically handle clean-up during tearDown by > instrumenting the create method so that it keeps track of all creations. > > context: > https://github.com/apache/kafka/pull/15084#issuecomment-1871302733 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16064) improve ControllerApiTest
Luke Chen created KAFKA-16064: - Summary: improve ControllerApiTest Key: KAFKA-16064 URL: https://issues.apache.org/jira/browse/KAFKA-16064 Project: Kafka Issue Type: Test Reporter: Luke Chen It's usually more robust to automatically handle clean-up during tearDown by instrumenting the create method so that it keeps track of all creations. context: https://github.com/apache/kafka/pull/15084#issuecomment-1871302733 -- This message was sent by Atlassian Jira (v8.20.10#820010)
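The clean-up suggestion above can be sketched briefly. This is not Kafka's ControllerApiTest code; the harness, `create`, and `tear_down` names below are hypothetical, and the sketch only illustrates instrumenting the create method to track every creation so tearDown can release them all automatically:

```python
class TrackingHarness:
    """Hypothetical test harness: every created resource is registered so
    tear_down can close them all, instead of each test remembering to."""

    def __init__(self):
        self._created = []

    def create(self, factory):
        resource = factory()
        self._created.append(resource)  # track for automatic cleanup
        return resource

    def tear_down(self):
        while self._created:
            self._created.pop().close()  # close in reverse creation order

class FakeClient:
    """Stand-in for a resource that must be closed (e.g. a raft client)."""
    def __init__(self):
        self.closed = False
    def close(self):
        self.closed = True

h = TrackingHarness()
a = h.create(FakeClient)
b = h.create(FakeClient)
h.tear_down()
```

With this pattern, a test that forgets its own cleanup can no longer leak: anything obtained through `create` is closed in tearDown regardless.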
[jira] [Created] (KAFKA-16058) Fix leak in ControllerApiTest
Luke Chen created KAFKA-16058: - Summary: Fix leak in ControllerApiTest Key: KAFKA-16058 URL: https://issues.apache.org/jira/browse/KAFKA-16058 Project: Kafka Issue Type: Sub-task Reporter: Luke Chen Assignee: Luke Chen PR: https://github.com/apache/kafka/pull/15084 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-16052) OOM in Kafka test suite
[ https://issues.apache.org/jira/browse/KAFKA-16052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800978#comment-17800978 ] Luke Chen edited comment on KAFKA-16052 at 12/28/23 12:13 PM: -- Sharing my way to detect the leaking threads: [https://github.com/apache/kafka/pull/15052/files#diff-b8f9f9d1b191457cbdb332a3429f0ad65b50fa4cef5af8562abcfd1f177a2cfeR2441] In this drafted PR, I added an "expected thread names" list (allow list) and try to find threads that are not expected. The verification runs each time `QuorumTestHarness` runs (beforeAll/afterAll). The CI result [here|https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-15052/5] shows the unexpected threads detected. Usually, I have to trace back to earlier test cases to find out which one leaked the thread. But that, at least, gives us some clue. was (Author: showuon): Sharing my way to detect the leaking threads: [https://github.com/apache/kafka/pull/15052/files#diff-b8f9f9d1b191457cbdb332a3429f0ad65b50fa4cef5af8562abcfd1f177a2cfeR2441] In this drafted PR, I added some "expected thread names" list (white list), and try to find threads that are not expected. The verification will be checked on each time `QuorumTestHarness` run (beforeAll/afterAll). The CI result [here|https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-15052/5] shows the unexpected threads detected. Usually, I have to trace back to earlier test cases to find out which one leaked the thread. But that's at least give us some clue. 
> OOM in Kafka test suite > --- > > Key: KAFKA-16052 > URL: https://issues.apache.org/jira/browse/KAFKA-16052 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.7.0 >Reporter: Divij Vaidya >Priority: Major > Attachments: Screenshot 2023-12-27 at 14.04.52.png, Screenshot > 2023-12-27 at 14.22.21.png, Screenshot 2023-12-27 at 14.45.20.png, Screenshot > 2023-12-27 at 15.31.09.png, Screenshot 2023-12-27 at 17.44.09.png, Screenshot > 2023-12-28 at 00.13.06.png, Screenshot 2023-12-28 at 00.18.56.png, Screenshot > 2023-12-28 at 11.26.03.png, Screenshot 2023-12-28 at 11.26.09.png, newRM.patch > > > *Problem* > Our test suite is failing with frequent OOM. Discussion in the mailing list > is here: [https://lists.apache.org/thread/d5js0xpsrsvhgjb10mbzo9cwsy8087x4] > *Setup* > To find the source of leaks, I ran the :core:test build target with a single > thread (see below on how to do it) and attached a profiler to it. This Jira > tracks the list of action items identified from the analysis. > How to run tests using a single thread: > {code:java} > diff --git a/build.gradle b/build.gradle > index f7abbf4f0b..81df03f1ee 100644 > --- a/build.gradle > +++ b/build.gradle > @@ -74,9 +74,8 @@ ext { > "--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED" > )- maxTestForks = project.hasProperty('maxParallelForks') ? > maxParallelForks.toInteger() : Runtime.runtime.availableProcessors() > - maxScalacThreads = project.hasProperty('maxScalacThreads') ? > maxScalacThreads.toInteger() : > - Math.min(Runtime.runtime.availableProcessors(), 8) > + maxTestForks = 1 > + maxScalacThreads = 1 > userIgnoreFailures = project.hasProperty('ignoreFailures') ? > ignoreFailures : false userMaxTestRetries = > project.hasProperty('maxTestRetries') ? 
maxTestRetries.toInteger() : 0 > diff --git a/gradle.properties b/gradle.properties > index 4880248cac..ee4b6e3bc1 100644 > --- a/gradle.properties > +++ b/gradle.properties > @@ -30,4 +30,4 @@ scalaVersion=2.13.12 > swaggerVersion=2.2.8 > task=build > org.gradle.jvmargs=-Xmx2g -Xss4m -XX:+UseParallelGC > -org.gradle.parallel=true > +org.gradle.parallel=false {code} > *Result of experiment* > This is how the heap memory utilized looks like, starting from tens of MB to > ending with 1.5GB (with spikes of 2GB) of heap being used as the test > executes. Note that the total number of threads also increases but it does > not correlate with sharp increase in heap memory usage. The heap dump is > available at > [https://www.dropbox.com/scl/fi/nwtgc6ir6830xlfy9z9cu/GradleWorkerMain_10311_27_12_2023_13_37_08.hprof.zip?rlkey=ozbdgh5vih4rcynnxbatzk7ln&dl=0] > > !Screenshot 2023-12-27 at 14.22.21.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
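The allow-list technique from the comment above can be sketched in a few lines. Kafka's actual check (in the linked PR) is JVM-side and its allow list is much longer; the names and prefixes below are hypothetical, and the sketch only shows the mechanism of enumerating live threads and flagging any whose name matches no expected prefix:

```python
import threading
import time

# Hypothetical allow list of expected thread-name prefixes.
EXPECTED_PREFIXES = ("MainThread",)

def find_unexpected_threads(expected_prefixes=EXPECTED_PREFIXES):
    """Return names of live threads matching no expected prefix (likely leaks)."""
    return [
        t.name
        for t in threading.enumerate()
        if not any(t.name.startswith(p) for p in expected_prefixes)
    ]

# Simulate a leaked worker thread and detect it while it is still alive.
leak = threading.Thread(name="leaked-fetcher-0", target=time.sleep, args=(0.5,))
leak.start()
unexpected = find_unexpected_threads()
leak.join()
```

Running such a check before and after each harness instance (the beforeAll/afterAll hooks mentioned above) narrows down which test left a thread behind, even though tracing back to the exact offender may still take some manual work.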
[jira] [Commented] (KAFKA-16052) OOM in Kafka test suite
[ https://issues.apache.org/jira/browse/KAFKA-16052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800978#comment-17800978 ] Luke Chen commented on KAFKA-16052: --- Sharing my way to detect the leaking threads: [https://github.com/apache/kafka/pull/15052/files#diff-b8f9f9d1b191457cbdb332a3429f0ad65b50fa4cef5af8562abcfd1f177a2cfeR2441] In this drafted PR, I added an "expected thread names" list (allow list) and try to find threads that are not expected. The verification runs each time `QuorumTestHarness` runs (beforeAll/afterAll). The CI result [here|https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-15052/5] shows the unexpected threads detected. Usually, I have to trace back to earlier test cases to find out which one leaked the thread. But that at least gives us some clue. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-16052) OOM in Kafka test suite
[ https://issues.apache.org/jira/browse/KAFKA-16052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800967#comment-17800967 ] Luke Chen commented on KAFKA-16052: --- Great! Thanks! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-16052) OOM in Kafka test suite
[ https://issues.apache.org/jira/browse/KAFKA-16052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800948#comment-17800948 ] Luke Chen commented on KAFKA-16052: --- OK, I've checked the output before my change, and there was also a lot of error output, so it looks expected. I've created a PR for it: https://github.com/apache/kafka/pull/15083. [~divijvaidya], please verify in your env again when available (after the holidays). Thanks. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-16052) OOM in Kafka test suite
[ https://issues.apache.org/jira/browse/KAFKA-16052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800942#comment-17800942 ] Luke Chen commented on KAFKA-16052: --- I'm trying to fix it by creating a "real" replicaManager instead of a mocked one, and I can see that the heap size is reduced a lot. Here's the patch: https://issues.apache.org/jira/secure/attachment/13065644/newRM.patch The only problem is that, because this is a real dummy replicaManager, it outputs many errors while testing. I'm checking on it.
[jira] [Updated] (KAFKA-16052) OOM in Kafka test suite
[ https://issues.apache.org/jira/browse/KAFKA-16052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen updated KAFKA-16052: -- Attachment: newRM.patch
[jira] [Resolved] (KAFKA-16035) add integration test for ExpiresPerSec and RemoteLogSizeComputationTime metrics
[ https://issues.apache.org/jira/browse/KAFKA-16035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen resolved KAFKA-16035. --- Fix Version/s: 3.7.0 Resolution: Fixed > add integration test for ExpiresPerSec and RemoteLogSizeComputationTime > metrics > --- > > Key: KAFKA-16035 > URL: https://issues.apache.org/jira/browse/KAFKA-16035 > Project: Kafka > Issue Type: Test >Reporter: Luke Chen >Assignee: Luke Chen >Priority: Major > Fix For: 3.7.0 > > > add integration test for ExpiresPerSec and RemoteLogSizeComputationTime > metrics > https://github.com/apache/kafka/pull/15015/commits/517a7c19d5a19bc94f0f79c02a239fd1ff7f6991 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-16014) Implement RemoteLogSizeComputationTime, RemoteLogSizeBytes, RemoteLogMetadataCount
[ https://issues.apache.org/jira/browse/KAFKA-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen reassigned KAFKA-16014: - Assignee: Luke Chen > Implement RemoteLogSizeComputationTime, RemoteLogSizeBytes, > RemoteLogMetadataCount > -- > > Key: KAFKA-16014 > URL: https://issues.apache.org/jira/browse/KAFKA-16014 > Project: Kafka > Issue Type: Sub-task >Reporter: Luke Chen >Assignee: Luke Chen >Priority: Major > Fix For: 3.7.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16035) add integration test for ExpiresPerSec and RemoteLogSizeComputationTime metrics
Luke Chen created KAFKA-16035: - Summary: add integration test for ExpiresPerSec and RemoteLogSizeComputationTime metrics Key: KAFKA-16035 URL: https://issues.apache.org/jira/browse/KAFKA-16035 Project: Kafka Issue Type: Test Reporter: Luke Chen Assignee: Luke Chen add integration test for ExpiresPerSec and RemoteLogSizeComputationTime metrics https://github.com/apache/kafka/pull/15015/commits/517a7c19d5a19bc94f0f79c02a239fd1ff7f6991 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-15158) Add metrics for RemoteDeleteRequestsPerSec, RemoteDeleteErrorsPerSec, BuildRemoteLogAuxStateRequestsPerSec, BuildRemoteLogAuxStateErrorsPerSec
[ https://issues.apache.org/jira/browse/KAFKA-15158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen resolved KAFKA-15158. --- Resolution: Fixed > Add metrics for RemoteDeleteRequestsPerSec, RemoteDeleteErrorsPerSec, > BuildRemoteLogAuxStateRequestsPerSec, BuildRemoteLogAuxStateErrorsPerSec > -- > > Key: KAFKA-15158 > URL: https://issues.apache.org/jira/browse/KAFKA-15158 > Project: Kafka > Issue Type: Sub-task >Reporter: Divij Vaidya >Assignee: Gantigmaa Selenge >Priority: Major > Fix For: 3.7.0 > > > Add the following metrics for better observability into the RemoteLog related > activities inside the broker. > 1. RemoteWriteRequestsPerSec > 2. RemoteDeleteRequestsPerSec > 3. BuildRemoteLogAuxStateRequestsPerSec > > These metrics will be calculated at topic level (we can add them at > brokerTopicStats) > -*RemoteWriteRequestsPerSec* will be marked on every call to > RemoteLogManager#- > -copyLogSegmentsToRemote()- already covered by KAFKA-14953 > > *RemoteDeleteRequestsPerSec* will be marked on every call to > RemoteLogManager#cleanupExpiredRemoteLogSegments(). This method is introduced > in [https://github.com/apache/kafka/pull/13561] > *BuildRemoteLogAuxStateRequestsPerSec* will be marked on every call to > ReplicaFetcherTierStateMachine#buildRemoteLogAuxState() > > (Note: For all the above, add Error metrics as well such as > RemoteDeleteErrorPerSec) > (Note: This requires a change in KIP-405 and hence, must be approved by KIP > author [~satishd] ) > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16031) Enabling testApplyDeltaShouldHandleReplicaAssignedToOnlineDirectory for tiered storage after supporting JBOD
Luke Chen created KAFKA-16031: - Summary: Enabling testApplyDeltaShouldHandleReplicaAssignedToOnlineDirectory for tiered storage after supporting JBOD Key: KAFKA-16031 URL: https://issues.apache.org/jira/browse/KAFKA-16031 Project: Kafka Issue Type: Test Components: Tiered-Storage Reporter: Luke Chen Currently, tiered storage doesn't support JBOD (multiple log dirs). The test testApplyDeltaShouldHandleReplicaAssignedToOnlineDirectory requires multiple log dirs to run it. We should enable it for tiered storage after supporting JBOD in tiered storage. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-15341) Enabling TS for a topic during rolling restart causes problems
[ https://issues.apache.org/jira/browse/KAFKA-15341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17798145#comment-17798145 ] Luke Chen commented on KAFKA-15341: --- I can check it after 3.7 release. Quite busy these days, sorry. > Enabling TS for a topic during rolling restart causes problems > -- > > Key: KAFKA-15341 > URL: https://issues.apache.org/jira/browse/KAFKA-15341 > Project: Kafka > Issue Type: Bug >Reporter: Divij Vaidya >Assignee: Phuc Hong Tran >Priority: Major > Labels: KIP-405 > Fix For: 3.7.0 > > > When we are in a rolling restart to enable TS at system level, some brokers > have TS enabled on them and some don't. We send an alter config call to > enable TS for a topic, it hits a broker which has TS enabled, this broker > forwards it to the controller and controller will send the config update to > all brokers. When another broker which doesn't have TS enabled (because it > hasn't undergone the restart yet) gets this config change, it "should" fail > to apply it. But failing now is too late since alterConfig has already > succeeded since controller->broker config propagation is done async. > With this JIRA, we want to have controller check if TS is enabled on all > brokers before applying alter config to turn on TS for a topic. > Context: https://github.com/apache/kafka/pull/14176#discussion_r1291265129 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-16013) Implement ExpiresPerSec metric
[ https://issues.apache.org/jira/browse/KAFKA-16013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17796680#comment-17796680 ] Luke Chen commented on KAFKA-16013: --- [~nikramakrishnan], if you have time, you can pick up this ticket: https://issues.apache.org/jira/browse/KAFKA-16014. Thanks. > Implement ExpiresPerSec metric > -- > > Key: KAFKA-16013 > URL: https://issues.apache.org/jira/browse/KAFKA-16013 > Project: Kafka > Issue Type: Sub-task >Reporter: Luke Chen >Assignee: Luke Chen >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-16013) Implement ExpiresPerSec metric
[ https://issues.apache.org/jira/browse/KAFKA-16013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17796675#comment-17796675 ] Luke Chen commented on KAFKA-16013: --- [~nikramakrishnan], I'm working on it now; let me see if I can open a PR later, and you can comment there.
[jira] [Assigned] (KAFKA-16013) Implement ExpiresPerSec metric
[ https://issues.apache.org/jira/browse/KAFKA-16013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen reassigned KAFKA-16013: - Assignee: Luke Chen
[jira] [Updated] (KAFKA-15147) Measure pending and outstanding Remote Segment operations
[ https://issues.apache.org/jira/browse/KAFKA-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen updated KAFKA-15147: -- Description: https://cwiki.apache.org/confluence/display/KAFKA/KIP-963%3A+Upload+and+delete+lag+metrics+in+Tiered+Storage KAFKA-15833: RemoteCopyLagBytes KAFKA-16002: RemoteCopyLagSegments, RemoteDeleteLagBytes, RemoteDeleteLagSegments KAFKA-16013: ExpiresPerSec KAFKA-16014: RemoteLogSizeComputationTime, RemoteLogSizeBytes, RemoteLogMetadataCount KAFKA-15158: RemoteDeleteRequestsPerSec, RemoteDeleteErrorsPerSec, BuildRemoteLogAuxStateRequestsPerSec, BuildRemoteLogAuxStateErrorsPerSec Remote Log Segment operations (copy/delete) are executed by the Remote Storage Manager, and recorded by Remote Log Metadata Manager (e.g. default TopicBasedRLMM writes to the internal Kafka topic state changes on remote log segments). As executions run, fail, and retry; it will be important to know how many operations are pending and outstanding over time to alert operators. Pending operations are not enough to alert, as values can oscillate closer to zero. An additional condition needs to apply (running time > threshold) to consider an operation outstanding. Proposal: RemoteLogManager could be extended with 2 concurrent maps (pendingSegmentCopies, pendingSegmentDeletes) `Map[Uuid, Long]` to measure segmentId time when operation started, and based on this expose 2 metrics per operation: * pendingSegmentCopies: gauge of pendingSegmentCopies map * outstandingSegmentCopies: loop over pending ops, and if now - startedTime > timeout, then outstanding++ (maybe on debug level?) Is this a valuable metric to add to Tiered Storage? or better to solve on a custom RLMM implementation? Also, does it require a KIP? Thanks! 
was: KAFKA-15833: RemoteCopyLagBytes KAFKA-16002: RemoteCopyLagSegments, RemoteDeleteLagBytes, RemoteDeleteLagSegments KAFKA-16013: ExpiresPerSec KAFKA-16014: RemoteLogSizeComputationTime, RemoteLogSizeBytes, RemoteLogMetadataCount KAFKA-15158: RemoteDeleteRequestsPerSec, RemoteDeleteErrorsPerSec, BuildRemoteLogAuxStateRequestsPerSec, BuildRemoteLogAuxStateErrorsPerSec Remote Log Segment operations (copy/delete) are executed by the Remote Storage Manager, and recorded by Remote Log Metadata Manager (e.g. default TopicBasedRLMM writes to the internal Kafka topic state changes on remote log segments). As executions run, fail, and retry; it will be important to know how many operations are pending and outstanding over time to alert operators. Pending operations are not enough to alert, as values can oscillate closer to zero. An additional condition needs to apply (running time > threshold) to consider an operation outstanding. Proposal: RemoteLogManager could be extended with 2 concurrent maps (pendingSegmentCopies, pendingSegmentDeletes) `Map[Uuid, Long]` to measure segmentId time when operation started, and based on this expose 2 metrics per operation: * pendingSegmentCopies: gauge of pendingSegmentCopies map * outstandingSegmentCopies: loop over pending ops, and if now - startedTime > timeout, then outstanding++ (maybe on debug level?) Is this a valuable metric to add to Tiered Storage? or better to solve on a custom RLMM implementation? Also, does it require a KIP? Thanks! 
> Measure pending and outstanding Remote Segment operations > - > > Key: KAFKA-15147 > URL: https://issues.apache.org/jira/browse/KAFKA-15147 > Project: Kafka > Issue Type: Improvement > Components: core >Reporter: Jorge Esteban Quilcate Otoya >Assignee: Christo Lolov >Priority: Major > Labels: tiered-storage > Fix For: 3.7.0 > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-963%3A+Upload+and+delete+lag+metrics+in+Tiered+Storage > > KAFKA-15833: RemoteCopyLagBytes > KAFKA-16002: RemoteCopyLagSegments, RemoteDeleteLagBytes, > RemoteDeleteLagSegments > KAFKA-16013: ExpiresPerSec > KAFKA-16014: RemoteLogSizeComputationTime, RemoteLogSizeBytes, > RemoteLogMetadataCount > KAFKA-15158: RemoteDeleteRequestsPerSec, RemoteDeleteErrorsPerSec, > BuildRemoteLogAuxStateRequestsPerSec, BuildRemoteLogAuxStateErrorsPerSec > > Remote Log Segment operations (copy/delete) are executed by the Remote > Storage Manager, and recorded by Remote Log Metadata Manager (e.g. default > TopicBasedRLMM writes to the internal Kafka topic state changes on remote log > segments). > As executions run, fail, and retry; it will be important to know how many > operations are pending and outstanding over time to alert operators. > Pending operations are not enough to alert, as values can oscillate closer to > zero. An additional condition needs to apply (running time > threshold) to > consider an operation out
[jira] [Updated] (KAFKA-15147) Measure pending and outstanding Remote Segment operations
[ https://issues.apache.org/jira/browse/KAFKA-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen updated KAFKA-15147: -- Description: KAFKA-15833: RemoteCopyLagBytes KAFKA-16002: RemoteCopyLagSegments, RemoteDeleteLagBytes, RemoteDeleteLagSegments KAFKA-16013: ExpiresPerSec KAFKA-16014: RemoteLogSizeComputationTime, RemoteLogSizeBytes, RemoteLogMetadataCount KAFKA-15158: RemoteDeleteRequestsPerSec, RemoteDeleteErrorsPerSec, BuildRemoteLogAuxStateRequestsPerSec, BuildRemoteLogAuxStateErrorsPerSec Remote Log Segment operations (copy/delete) are executed by the Remote Storage Manager, and recorded by Remote Log Metadata Manager (e.g. default TopicBasedRLMM writes to the internal Kafka topic state changes on remote log segments). As executions run, fail, and retry; it will be important to know how many operations are pending and outstanding over time to alert operators. Pending operations are not enough to alert, as values can oscillate closer to zero. An additional condition needs to apply (running time > threshold) to consider an operation outstanding. Proposal: RemoteLogManager could be extended with 2 concurrent maps (pendingSegmentCopies, pendingSegmentDeletes) `Map[Uuid, Long]` to measure segmentId time when operation started, and based on this expose 2 metrics per operation: * pendingSegmentCopies: gauge of pendingSegmentCopies map * outstandingSegmentCopies: loop over pending ops, and if now - startedTime > timeout, then outstanding++ (maybe on debug level?) Is this a valuable metric to add to Tiered Storage? or better to solve on a custom RLMM implementation? Also, does it require a KIP? Thanks! was: Remote Log Segment operations (copy/delete) are executed by the Remote Storage Manager, and recorded by Remote Log Metadata Manager (e.g. default TopicBasedRLMM writes to the internal Kafka topic state changes on remote log segments). 
As executions run, fail, and retry; it will be important to know how many operations are pending and outstanding over time to alert operators. Pending operations are not enough to alert, as values can oscillate closer to zero. An additional condition needs to apply (running time > threshold) to consider an operation outstanding. Proposal: RemoteLogManager could be extended with 2 concurrent maps (pendingSegmentCopies, pendingSegmentDeletes) `Map[Uuid, Long]` to measure segmentId time when operation started, and based on this expose 2 metrics per operation: * pendingSegmentCopies: gauge of pendingSegmentCopies map * outstandingSegmentCopies: loop over pending ops, and if now - startedTime > timeout, then outstanding++ (maybe on debug level?) Is this a valuable metric to add to Tiered Storage? or better to solve on a custom RLMM implementation? Also, does it require a KIP? Thanks! > Measure pending and outstanding Remote Segment operations > - > > Key: KAFKA-15147 > URL: https://issues.apache.org/jira/browse/KAFKA-15147 > Project: Kafka > Issue Type: Improvement > Components: core >Reporter: Jorge Esteban Quilcate Otoya >Assignee: Christo Lolov >Priority: Major > Labels: tiered-storage > Fix For: 3.7.0 > > > KAFKA-15833: RemoteCopyLagBytes > KAFKA-16002: RemoteCopyLagSegments, RemoteDeleteLagBytes, > RemoteDeleteLagSegments > KAFKA-16013: ExpiresPerSec > KAFKA-16014: RemoteLogSizeComputationTime, RemoteLogSizeBytes, > RemoteLogMetadataCount > KAFKA-15158: RemoteDeleteRequestsPerSec, RemoteDeleteErrorsPerSec, > BuildRemoteLogAuxStateRequestsPerSec, BuildRemoteLogAuxStateErrorsPerSec > > Remote Log Segment operations (copy/delete) are executed by the Remote > Storage Manager, and recorded by Remote Log Metadata Manager (e.g. default > TopicBasedRLMM writes to the internal Kafka topic state changes on remote log > segments). > As executions run, fail, and retry; it will be important to know how many > operations are pending and outstanding over time to alert operators. 
> Pending operations are not enough to alert, as values can oscillate closer to > zero. An additional condition needs to apply (running time > threshold) to > consider an operation outstanding. > Proposal: > RemoteLogManager could be extended with 2 concurrent maps > (pendingSegmentCopies, pendingSegmentDeletes) `Map[Uuid, Long]` to measure > segmentId time when operation started, and based on this expose 2 metrics per > operation: > * pendingSegmentCopies: gauge of pendingSegmentCopies map > * outstandingSegmentCopies: loop over pending ops, and if now - startedTime > > timeout, then outstanding++ (maybe on debug level?) > Is this a valuable metric to add to Tiered Storage? or better to solve on a > custom RLMM implementation? > Also, does it require a KIP? > Thanks! -- This me
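The bookkeeping proposed in KAFKA-15147 (two concurrent maps from segment id to operation start time, with a "pending" gauge and a timeout-based "outstanding" gauge) could be sketched roughly as below. This is a minimal illustration only; the class and method names are invented for this sketch and do not reflect the actual RemoteLogManager API.

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch of the KAFKA-15147 proposal (names are hypothetical).
public class SegmentOpTracker {
    // segmentId -> wall-clock time (ms) when the copy operation started
    private final Map<String, Long> pendingSegmentCopies = new ConcurrentHashMap<>();
    private final long timeoutMs;

    public SegmentOpTracker(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    public void copyStarted(String segmentId, long nowMs) {
        pendingSegmentCopies.put(segmentId, nowMs);
    }

    public void copyFinished(String segmentId) {
        pendingSegmentCopies.remove(segmentId);
    }

    // "pendingSegmentCopies" gauge: every copy currently in flight.
    public int pendingCount() {
        return pendingSegmentCopies.size();
    }

    // "outstandingSegmentCopies" gauge: loop over pending ops and count those
    // whose running time exceeds the timeout (now - startedTime > timeout).
    public long outstandingCount(long nowMs) {
        return pendingSegmentCopies.values().stream()
                .filter(startedMs -> nowMs - startedMs > timeoutMs)
                .count();
    }
}
{code}

A second map of the same shape would cover deletes; the outstanding gauge oscillates less than the pending one because it only counts operations that have already run past the threshold.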
[jira] [Created] (KAFKA-16014) Implement RemoteLogSizeComputationTime, RemoteLogSizeBytes, RemoteLogMetadataCount
Luke Chen created KAFKA-16014: - Summary: Implement RemoteLogSizeComputationTime, RemoteLogSizeBytes, RemoteLogMetadataCount Key: KAFKA-16014 URL: https://issues.apache.org/jira/browse/KAFKA-16014 Project: Kafka Issue Type: Sub-task Reporter: Luke Chen -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16013) Implement ExpiresPerSec metric
Luke Chen created KAFKA-16013: - Summary: Implement ExpiresPerSec metric Key: KAFKA-16013 URL: https://issues.apache.org/jira/browse/KAFKA-16013 Project: Kafka Issue Type: Sub-task Reporter: Luke Chen -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-15158) Add metrics for RemoteDeleteRequestsPerSec, RemoteDeleteErrorsPerSec, BuildRemoteLogAuxStateRequestsPerSec, BuildRemoteLogAuxStateErrorsPerSec
[ https://issues.apache.org/jira/browse/KAFKA-15158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen updated KAFKA-15158: -- Summary: Add metrics for RemoteDeleteRequestsPerSec, RemoteDeleteErrorsPerSec, BuildRemoteLogAuxStateRequestsPerSec, BuildRemoteLogAuxStateErrorsPerSec (was: Add metrics for RemoteRequestsPerSec)
[jira] [Updated] (KAFKA-15158) Add metrics for RemoteRequestsPerSec
[ https://issues.apache.org/jira/browse/KAFKA-15158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen updated KAFKA-15158: -- Parent: KAFKA-15147 Issue Type: Sub-task (was: Improvement)
[jira] [Updated] (KAFKA-15883) Implement RemoteCopyLagBytes
[ https://issues.apache.org/jira/browse/KAFKA-15883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen updated KAFKA-15883: -- Fix Version/s: 3.7.0 > Implement RemoteCopyLagBytes > > > Key: KAFKA-15883 > URL: https://issues.apache.org/jira/browse/KAFKA-15883 > Project: Kafka > Issue Type: Sub-task >Reporter: Christo Lolov >Assignee: Christo Lolov >Priority: Major > Fix For: 3.7.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-15529) Flaky test ReassignReplicaShrinkTest.executeTieredStorageTest
[ https://issues.apache.org/jira/browse/KAFKA-15529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen updated KAFKA-15529: -- Fix Version/s: 3.7.0 > Flaky test ReassignReplicaShrinkTest.executeTieredStorageTest > - > > Key: KAFKA-15529 > URL: https://issues.apache.org/jira/browse/KAFKA-15529 > Project: Kafka > Issue Type: Test > Components: Tiered-Storage >Affects Versions: 3.6.0 >Reporter: Divij Vaidya >Priority: Blocker > Labels: flaky-test > Fix For: 3.7.0 > > > Example of failed CI build - > [https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-14449/3/testReport/junit/org.apache.kafka.tiered.storage.integration/ReassignReplicaShrinkTest/Build___JDK_21_and_Scala_2_13___executeTieredStorageTest_String__quorum_kraft_2/] > > {noformat} > org.opentest4j.AssertionFailedError: Number of fetch requests from broker 0 > to the tier storage does not match the expected value for topic-partition > topicA-1 ==> expected: <3> but was: <4> > at > app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151) > at > app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132) > at > app//org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197) > at > app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:150) > at > app//org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:559) > at > app//org.apache.kafka.tiered.storage.actions.ConsumeAction.doExecute(ConsumeAction.java:128) > at > app//org.apache.kafka.tiered.storage.TieredStorageTestAction.execute(TieredStorageTestAction.java:25) > at > app//org.apache.kafka.tiered.storage.TieredStorageTestHarness.executeTieredStorageTest(TieredStorageTestHarness.java:112){noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-15529) Flaky test ReassignReplicaShrinkTest.executeTieredStorageTest
[ https://issues.apache.org/jira/browse/KAFKA-15529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen updated KAFKA-15529: -- Priority: Blocker (was: Major)
[jira] [Resolved] (KAFKA-15979) Add KIP-1001 CurrentControllerId metric
[ https://issues.apache.org/jira/browse/KAFKA-15979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen resolved KAFKA-15979. --- Resolution: Duplicate Duplicate of KAFKA-15980. Closing this one. cc [~cmccabe] > Add KIP-1001 CurrentControllerId metric > --- > > Key: KAFKA-15979 > URL: https://issues.apache.org/jira/browse/KAFKA-15979 > Project: Kafka > Issue Type: Improvement >Reporter: Colin McCabe >Assignee: Colin McCabe >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-12670) KRaft support for unclean.leader.election.enable
[ https://issues.apache.org/jira/browse/KAFKA-12670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17793974#comment-17793974 ] Luke Chen commented on KAFKA-12670: --- [~cmccabe], do you have any idea when KRaft will support unclean leader election? > KRaft support for unclean.leader.election.enable > > > Key: KAFKA-12670 > URL: https://issues.apache.org/jira/browse/KAFKA-12670 > Project: Kafka > Issue Type: Task >Reporter: Colin McCabe >Assignee: Ryan Dielhenn >Priority: Major > > Implement KRaft support for the unclean.leader.election.enable > configurations. These configurations can be set at the topic, broker, or > cluster level. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-15911) KRaft quorum leader should make sure the follower fetch is making progress
[ https://issues.apache.org/jira/browse/KAFKA-15911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen reassigned KAFKA-15911: - Assignee: Luke Chen > KRaft quorum leader should make sure the follower fetch is making progress > -- > > Key: KAFKA-15911 > URL: https://issues.apache.org/jira/browse/KAFKA-15911 > Project: Kafka > Issue Type: Bug > Components: kraft >Affects Versions: 3.7.0 >Reporter: Luke Chen >Assignee: Luke Chen >Priority: Major > > Just because the leader returned a successful response to FETCH and > FETCH_SNAPSHOT doesn't mean that the followers were able to handle the > response correctly. > For example, imagine the case where the log end offset (LEO) is at 1000 and > all of the followers are continuously fetching at offset 0 without ever > increasing their fetch offset. This can happen if the followers encounter an > error when processing the FETCH or FETCH_SNAPSHOT response. > In this scenario the leader will never be able to increase the HWM. I think > that this scenario is specific to KRaft and doesn't exist in Raft because > KRaft is pull-based whereas Raft is push-based. > https://github.com/apache/kafka/pull/14428#pullrequestreview-1751408695 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-15911) KRaft quorum leader should make sure the follower fetch is making progress
Luke Chen created KAFKA-15911: - Summary: KRaft quorum leader should make sure the follower fetch is making progress Key: KAFKA-15911 URL: https://issues.apache.org/jira/browse/KAFKA-15911 Project: Kafka Issue Type: Bug Components: kraft Reporter: Luke Chen Just because the leader returned a successful response to FETCH and FETCH_SNAPSHOT doesn't mean that the followers were able to handle the response correctly. For example, imagine the case where the log end offset (LEO) is at 1000 and all of the followers are continuously fetching at offset 0 without ever increasing their fetch offset. This can happen if the followers encounter an error when processing the FETCH or FETCH_SNAPSHOT response. In this scenario the leader will never be able to increase the HWM. I think that this scenario is specific to KRaft and doesn't exist in Raft because KRaft is pull-based whereas Raft is push-based. https://github.com/apache/kafka/pull/14428#pullrequestreview-1751408695 -- This message was sent by Atlassian Jira (v8.20.10#820010)
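The stalled-follower scenario described in this issue can be sketched as a small progress check. This is an illustrative model only; `FollowerProgressTracker` and its methods are hypothetical names, not the actual KRaft quorum implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: the leader records each follower's latest fetch
// offset and periodically checks whether it advanced. A follower that
// keeps fetching at the same offset is not making progress, even though
// every FETCH response it received was "successful".
public class FollowerProgressTracker {
    private final Map<Integer, Long> lastFetchOffset = new HashMap<>();

    // Called when a follower's FETCH request is received.
    public void onFetch(int followerId, long fetchOffset) {
        lastFetchOffset.put(followerId, fetchOffset);
    }

    // True only if the follower's fetch offset moved past the offset
    // observed at the previous check.
    public boolean isMakingProgress(int followerId, long previousOffset) {
        return lastFetchOffset.getOrDefault(followerId, 0L) > previousOffset;
    }
}
```

With a check of this shape, a follower stuck at offset 0 while the LEO is at 1000 would be flagged instead of silently holding back the HWM.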
[jira] [Commented] (KAFKA-15552) Duplicate Producer ID blocks during ZK migration
[ https://issues.apache.org/jira/browse/KAFKA-15552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786694#comment-17786694 ] Luke Chen commented on KAFKA-15552: --- Closing this ticket since the PR is merged. > Duplicate Producer ID blocks during ZK migration > > > Key: KAFKA-15552 > URL: https://issues.apache.org/jira/browse/KAFKA-15552 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.4.0, 3.5.0, 3.4.1, 3.6.0, 3.5.1 >Reporter: David Arthur >Assignee: David Arthur >Priority: Critical > Fix For: 3.5.2, 3.6.1 > > > When migrating producer ID blocks from ZK to KRaft, we are taking the current > producer ID block from ZK and writing its "firstProducerId" into the > producer IDs KRaft record. However, in KRaft we store the _next_ producer ID > block in the log rather than storing the current block like ZK does. The end > result is that the first block given to a caller of AllocateProducerIds is a > duplicate of the last block allocated in ZK mode. > > This can result in duplicate producer IDs being given to transactional or > idempotent producers. In the case of transactional producers, this can cause > long-term problems since the producer IDs are persisted and reused for a long > time. > The time between the last producer ID block being allocated by the ZK > controller and all the brokers being restarted following the metadata > migration is when this bug is possible. > > Symptoms of this bug will include ReplicaManager OutOfOrderSequenceException > and possibly some producer epoch validation errors. To see if a cluster is > affected by this bug, search for the offending producer ID and see if it is > being used by more than one producer. > > For example, the following error was observed > {code} > Out of order sequence number for producer 376000 at offset 381338 in > partition REDACTED: 0 (incoming seq.
number), 21 (current end sequence > number) > {code} > Then searching for "376000" on > org.apache.kafka.clients.producer.internals.TransactionManager logs, two > brokers both show the same producer ID being provisioned > {code} > Broker 0 [Producer clientId=REDACTED-0] ProducerId set to 376000 with epoch 1 > Broker 5 [Producer clientId=REDACTED-1] ProducerId set to 376000 with epoch 1 > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
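The root cause above (ZK persists the current producer ID block while KRaft persists the next one) can be modeled in a couple of lines. This is a sketch only; the 1000-ID block size and the method names are illustrative assumptions, not the actual migration code:

```java
// Hypothetical model of the bug: copying ZK's current block start into
// the KRaft record means the first post-migration allocation reuses the
// block ZK already handed out. The fix direction is to persist the start
// of the *next* block instead.
public class ProducerIdMigration {
    // Illustrative block size; not taken from the Kafka source.
    static final long BLOCK_SIZE = 1000L;

    // Buggy migration: the next allocation starts at the block ZK already gave out.
    static long buggyNextBlockStart(long zkCurrentBlockStart) {
        return zkCurrentBlockStart;
    }

    // Fixed migration: skip past the block ZK already allocated.
    static long fixedNextBlockStart(long zkCurrentBlockStart) {
        return zkCurrentBlockStart + BLOCK_SIZE;
    }
}
```

In the reported symptom, producer ID 376000 being provisioned to two producers is exactly the `buggyNextBlockStart` case.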
[jira] [Resolved] (KAFKA-15552) Duplicate Producer ID blocks during ZK migration
[ https://issues.apache.org/jira/browse/KAFKA-15552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen resolved KAFKA-15552. --- Resolution: Fixed > Duplicate Producer ID blocks during ZK migration -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-15802) Trying to access uncopied segments metadata on listOffsets
[ https://issues.apache.org/jira/browse/KAFKA-15802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17784705#comment-17784705 ] Luke Chen commented on KAFKA-15802: --- +1 to go with option 1 first. > Trying to access uncopied segments metadata on listOffsets > -- > > Key: KAFKA-15802 > URL: https://issues.apache.org/jira/browse/KAFKA-15802 > Project: Kafka > Issue Type: Bug > Components: Tiered-Storage >Affects Versions: 3.6.0 >Reporter: Francois Visconte >Priority: Major > > We have a tiered storage cluster running with Aiven s3 plugin. > On our cluster, we have a process doing regular listOffsets requests. > This triggers the following exception: > {code:java} > org.apache.kafka.common.KafkaException: > org.apache.kafka.server.log.remote.storage.RemoteResourceNotFoundException: > Requested remote resource was not found > at > org.apache.kafka.storage.internals.log.RemoteIndexCache.lambda$createCacheEntry$6(RemoteIndexCache.java:355) > at > org.apache.kafka.storage.internals.log.RemoteIndexCache.loadIndexFile(RemoteIndexCache.java:318) > Nov 09, 2023 1:42:01 PM com.github.benmanes.caffeine.cache.LocalAsyncCache > lambda$handleCompletion$7 > WARNING: Exception thrown during asynchronous load > java.util.concurrent.CompletionException: > io.aiven.kafka.tieredstorage.storage.KeyNotFoundException: Key > cluster/topic-0A_3phS5QWu9eU28KG0Lxg/24/00149691-Rdf4cUR_S4OYAGImco6Lbg.rsm-manifest > does not exists in storage S3Storage{bucketName='bucket', partSize=16777216} > at > com.github.benmanes.caffeine.cache.CacheLoader.lambda$asyncLoad$0(CacheLoader.java:107) > at > java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768) > at > java.base/java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1760) > at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:373) > at > java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1182) 
> at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1655) > at > java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1622) > at > java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:165) > Caused by: io.aiven.kafka.tieredstorage.storage.KeyNotFoundException: Key > cluster/topic-0A_3phS5QWu9eU28KG0Lxg/24/00149691-Rdf4cUR_S4OYAGImco6Lbg.rsm-manifest > does not exists in storage S3Storage{bucketName='bucket', partSize=16777216} > at io.aiven.kafka.tieredstorage.storage.s3.S3Storage.fetch(S3Storage.java:80) > at > io.aiven.kafka.tieredstorage.manifest.SegmentManifestProvider.lambda$new$1(SegmentManifestProvider.java:59) > at > com.github.benmanes.caffeine.cache.CacheLoader.lambda$asyncLoad$0(CacheLoader.java:103) > ... 7 more > Caused by: software.amazon.awssdk.services.s3.model.NoSuchKeyException: The > specified key does not exist. (Service: S3, Status Code: 404, Request ID: > CFMP27PVC9V2NNEM, Extended Request ID: > F5qqlV06qQJ5qCuWl91oueBaha0QLMBURJudnOnFDQk+YbgFcAg70JBATcARDxN44DGo+PpfZHAsum+ioYMoOw==) > at > software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handleErrorResponse(CombinedResponseHandler.java:125) > at > software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handleResponse(CombinedResponseHandler.java:82) > at > software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handle(CombinedResponseHandler.java:60) > at > software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handle(CombinedResponseHandler.java:41) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:40) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:30) > at > software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) > at > 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:72) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:42) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:78) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:40) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:52) > at > software.amazon.awssdk.core.internal.htt
[jira] [Commented] (KAFKA-15552) Duplicate Producer ID blocks during ZK migration
[ https://issues.apache.org/jira/browse/KAFKA-15552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17784346#comment-17784346 ] Luke Chen commented on KAFKA-15552: --- Backported to the 3.5 and 3.6 branches. For the 3.4 fix, since there are some conflicts to resolve, we're skipping it for now and will add it when needed. > Duplicate Producer ID blocks during ZK migration -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-15552) Duplicate Producer ID blocks during ZK migration
[ https://issues.apache.org/jira/browse/KAFKA-15552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen updated KAFKA-15552: -- Fix Version/s: (was: 3.4.2) > Duplicate Producer ID blocks during ZK migration -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-15800) Malformed connect source offsets corrupt other partitions with DataException
[ https://issues.apache.org/jira/browse/KAFKA-15800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17784341#comment-17784341 ] Luke Chen commented on KAFKA-15800: --- [~gharris1727], thanks for reporting this regression issue. Do you have an estimated time for when this PR will be merged? I'm planning to create a v3.5.2 CR build soon. Thanks. > Malformed connect source offsets corrupt other partitions with DataException > > > Key: KAFKA-15800 > URL: https://issues.apache.org/jira/browse/KAFKA-15800 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 3.5.0, 3.6.0, 3.5.1 >Reporter: Greg Harris >Assignee: Greg Harris >Priority: Blocker > Fix For: 3.5.2, 3.7.0, 3.6.1 > > > The KafkaOffsetBackingStore consumer callback was recently augmented with a > call to OffsetUtils.processPartitionKey: > [https://github.com/apache/kafka/blob/f1e58a35d7aebbe72844faf3e5019d9aa7a85e4a/connect/runtime/src/main/java/org/apache/kafka/connect/storage/KafkaOffsetBackingStore.java#L323] > This function deserializes the offset key, which may be malformed in the > topic: > [https://github.com/apache/kafka/blob/f1e58a35d7aebbe72844faf3e5019d9aa7a85e4a/connect/runtime/src/main/java/org/apache/kafka/connect/storage/OffsetUtils.java#L92] > When this happens, a DataException is thrown, and propagates to the > KafkaBasedLog try-catch surrounding the batch processing of the records: > [https://github.com/apache/kafka/blob/f1e58a35d7aebbe72844faf3e5019d9aa7a85e4a/connect/runtime/src/main/java/org/apache/kafka/connect/util/KafkaBasedLog.java#L445-L454] > For example: > {noformat} > ERROR Error polling: org.apache.kafka.connect.errors.DataException: > Converting byte[] to Kafka Connect data failed due to serialization error: > (org.apache.kafka.connect.util.KafkaBasedLog:453){noformat} > This means that one DataException for a malformed record may cause the > remainder of the batch to be dropped, corrupting the in-memory state of the 
KafkaOffsetBackingStore. This prevents tasks using the > KafkaOffsetBackingStore from seeing all of the offsets in the topics, and can > cause duplicate records to be emitted. -- This message was sent by Atlassian Jira (v8.20.10#820010)
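The fix direction implied by the report (contain the DataException per record rather than per batch) can be sketched as follows. The types here are simplified stand-ins, not the actual KafkaBasedLog API:

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: process each record in its own try-catch so one
// malformed offset key is skipped instead of aborting the whole batch
// and corrupting the in-memory offset state.
public class BatchProcessor {
    public static int processBatch(List<byte[]> records, Consumer<byte[]> processOne) {
        int processed = 0;
        for (byte[] record : records) {
            try {
                processOne.accept(record);
                processed++;
            } catch (RuntimeException e) {
                // Log and skip the malformed record; the remaining
                // records in the batch are still applied.
            }
        }
        return processed;
    }
}
```

With a per-record guard of this shape, a single bad offset key no longer hides every later offset in the same poll.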
[jira] [Resolved] (KAFKA-15689) KRaftMigrationDriver not logging the skipped event when expected state is wrong
[ https://issues.apache.org/jira/browse/KAFKA-15689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen resolved KAFKA-15689. --- Fix Version/s: 3.7.0 Resolution: Fixed > KRaftMigrationDriver not logging the skipped event when expected state is > wrong > --- > > Key: KAFKA-15689 > URL: https://issues.apache.org/jira/browse/KAFKA-15689 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.6.0 >Reporter: Paolo Patierno >Assignee: Paolo Patierno >Priority: Minor > Fix For: 3.7.0 > > > The KRaftMigrationDriver.checkDriverState is used in multiple implementations > of the > MigrationEvent base class but when it comes to logging that an event was skipped > because the expected state is wrong, it always logs "KRaftMigrationDriver" > instead of the skipped event. > For example, a logging line could be like this: > {code:java} > 2023-10-25 12:17:25,460 INFO [KRaftMigrationDriver id=5] Expected driver > state ZK_MIGRATION but found SYNC_KRAFT_TO_ZK. Not running this event > KRaftMigrationDriver. > (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [controller-5-migration-driver-event-handler] {code} > This is because its code has something like this: > {code:java} > log.info("Expected driver state {} but found {}. Not running this event {}.", > expectedState, migrationState, this.getClass().getSimpleName()); {code} > Of course, the "this" is referring to the KRaftMigrationDriver class. > It should print the specific skipped event instead. -- This message was sent by Atlassian Jira (v8.20.10#820010)
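The bug boils down to `this.getClass().getSimpleName()` being evaluated on the driver rather than on the skipped event. A minimal reduction (class names are illustrative, mirroring the report, not the real driver code):

```java
// Hypothetical reduction of the logging bug from the report: inside the
// driver's shared state-check helper, "this" is the driver, so the log
// line always names the driver instead of the skipped event.
public class KRaftMigrationDriverSketch {
    static class SendRPCsToBrokersEvent {}

    // Buggy: ignores the event and names the enclosing driver class.
    String buggySkipMessage(Object event) {
        return "Not running this event " + this.getClass().getSimpleName() + ".";
    }

    // Fixed: names the event that was actually skipped.
    String fixedSkipMessage(Object event) {
        return "Not running this event " + event.getClass().getSimpleName() + ".";
    }
}
```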
[jira] [Created] (KAFKA-15702) apiVersion request doesn't send when DNS is not ready at first
Luke Chen created KAFKA-15702: - Summary: apiVersion request doesn't send when DNS is not ready at first Key: KAFKA-15702 URL: https://issues.apache.org/jira/browse/KAFKA-15702 Project: Kafka Issue Type: Bug Affects Versions: 3.6.0 Reporter: Luke Chen When migrating ZK to KRaft, we [rely on apiVersions|https://github.com/apache/kafka/blob/3055cd7c180cac15016169c52383ddc204ca5f16/metadata/src/main/java/org/apache/kafka/controller/QuorumFeatures.java#L140] to check if all controllers enabled the ZkMigration flag. But if the DNS is not updated with the latest info (as frequently happens in k8s environments), then the active controller won't send out the apiVersion request. The impact is that the ZK-to-KRaft migration will be blocked since the active controller will consider the follower not ready. *The flow of the issue is like this:* 1. start up 3 controllers, ex: c1, c2, c3 2. The DNS doesn't update the host name entry of c3. 3. c1 becomes the leader, and sends an apiVersion request to c2 4. c1 tries to connect to c3, and gets an unknownHost error 5. DNS is updated with the c3 entry 6. c1 successfully connects to c3, but no apiVersion request is sent. 7. The KRaftMigrationDriver keeps waiting for c3 to be ready for ZK migration Had a look, and it looks like the channel cannot successfully finishConnect [here|https://github.com/apache/kafka/blob/3.6/clients/src/main/java/org/apache/kafka/common/network/Selector.java#L527], so the channel won't be considered connected and won't initiate an apiVersion request. -- This message was sent by Atlassian Jira (v8.20.10#820010)
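Steps 2 through 5 of the flow (a hostname that only becomes resolvable later) can be illustrated with a simple resolve-with-retry loop. This only illustrates the failure mode described above; it is not the actual fix, which belongs in the network client's connect path:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical illustration: keep retrying DNS resolution until the
// controller's hostname appears, instead of treating the first
// UnknownHostException as a lasting verdict on the host.
public class DnsRetry {
    static InetAddress resolveWithRetry(String host, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            try {
                return InetAddress.getByName(host);
            } catch (UnknownHostException e) {
                // Entry not published yet (e.g. k8s DNS lag); back off and retry.
                try {
                    Thread.sleep(100L);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                }
            }
        }
        throw new IllegalStateException("Host " + host + " not resolvable after " + maxAttempts + " attempts");
    }
}
```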
[jira] [Updated] (KAFKA-15670) KRaft controller should set inter broker listener when migration
[ https://issues.apache.org/jira/browse/KAFKA-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen updated KAFKA-15670: -- Description: During ZK migrating to KRaft, before entering dual-write mode, the KRaft controller will send RPCs (i.e. UpdateMetadataRequest, LeaderAndIsrRequest, and StopReplicaRequest) to the brokers. Currently, we use the inter broker listener to send the RPC to brokers from the controller. But in the doc, we didn't provide this info to users because the normal KRaft controller won't use `inter.broker.listener.names`. -Although these RPCs are used for ZK brokers, in our case, the sender is actually KRaft controller. In KRaft mode, the controller should talk with brokers via `controller.listener.names`, not `inter.broker.listener.names`.- The error while sending RPCs to brokers while the broker doesn't contain PLAINTEXT listener. {code:java} [2023-10-23 17:12:36,788] ERROR Encountered zk migration fault: Unhandled error in SendRPCsToBrokersEvent (org.apache.kafka.server.fault.LoggingFaultHandler) kafka.common.BrokerEndPointNotAvailableException: End point with listener name PLAINTEXT not found for broker 0 at kafka.cluster.Broker.$anonfun$node$1(Broker.scala:94) at scala.Option.getOrElse(Option.scala:201) at kafka.cluster.Broker.node(Broker.scala:93) at kafka.controller.ControllerChannelManager.addNewBroker(ControllerChannelManager.scala:122) at kafka.controller.ControllerChannelManager.addBroker(ControllerChannelManager.scala:105) at kafka.migration.MigrationPropagator.$anonfun$publishMetadata$2(MigrationPropagator.scala:97) at kafka.migration.MigrationPropagator.$anonfun$publishMetadata$2$adapted(MigrationPropagator.scala:97) at scala.collection.immutable.Set$Set1.foreach(Set.scala:168) at kafka.migration.MigrationPropagator.publishMetadata(MigrationPropagator.scala:97) at kafka.migration.MigrationPropagator.sendRPCsToBrokersFromMetadataImage(MigrationPropagator.scala:217) at 
org.apache.kafka.metadata.migration.KRaftMigrationDriver$SendRPCsToBrokersEvent.run(KRaftMigrationDriver.java:723) {code} was: During ZK migrating to KRaft, before entering dual-write mode, the KRaft controller will send RPCs (i.e. UpdateMetadataRequest, LeaderAndIsrRequest, and StopReplicaRequest) to the brokers. Currently, we use the inter broker listener to send the RPC to brokers from the controller. But in the doc, we didn't provide this info to users because the normal KRaft controller won't use `inter.broker.listener.names`. -Although these RPCs are used for ZK brokers, in our case, the sender is actually KRaft controller. In KRaft mode, the controller should talk with brokers via `controller.listener.names`, not `inter.broker.listener.names`. - The error while sending RPCs to brokers while the broker doesn't contain PLAINTEXT listener. {code:java} [2023-10-23 17:12:36,788] ERROR Encountered zk migration fault: Unhandled error in SendRPCsToBrokersEvent (org.apache.kafka.server.fault.LoggingFaultHandler) kafka.common.BrokerEndPointNotAvailableException: End point with listener name PLAINTEXT not found for broker 0 at kafka.cluster.Broker.$anonfun$node$1(Broker.scala:94) at scala.Option.getOrElse(Option.scala:201) at kafka.cluster.Broker.node(Broker.scala:93) at kafka.controller.ControllerChannelManager.addNewBroker(ControllerChannelManager.scala:122) at kafka.controller.ControllerChannelManager.addBroker(ControllerChannelManager.scala:105) at kafka.migration.MigrationPropagator.$anonfun$publishMetadata$2(MigrationPropagator.scala:97) at kafka.migration.MigrationPropagator.$anonfun$publishMetadata$2$adapted(MigrationPropagator.scala:97) at scala.collection.immutable.Set$Set1.foreach(Set.scala:168) at kafka.migration.MigrationPropagator.publishMetadata(MigrationPropagator.scala:97) at kafka.migration.MigrationPropagator.sendRPCsToBrokersFromMetadataImage(MigrationPropagator.scala:217) at 
org.apache.kafka.metadata.migration.KRaftMigrationDriver$SendRPCsToBrokersEvent.run(KRaftMigrationDriver.java:723) {code} > KRaft controller should set inter broker listener when migration > > > Key: KAFKA-15670 > URL: https://issues.apache.org/jira/browse/KAFKA-15670 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.6.0 >Reporter: Luke Chen >Assignee: Luke Chen >Priority: Major > > During ZK migrating to KRaft, before entering dual-write mode, the KRaft > controller will send RPCs (i.e. UpdateMetadataRequest, LeaderAndIsrRequest, > and StopReplicaRequest) to the brokers. Currently, we use the inter broker > listener to send the RPC to brokers from the controller. Bu
[jira] [Updated] (KAFKA-15670) KRaft controller should set inter broker listener when migration
[ https://issues.apache.org/jira/browse/KAFKA-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen updated KAFKA-15670: -- Description: During the ZK-to-KRaft migration, before entering dual-write mode, the KRaft controller sends RPCs (i.e. UpdateMetadataRequest, LeaderAndIsrRequest, and StopReplicaRequest) to the brokers. Currently, the controller sends these RPCs over the inter-broker listener, but we never documented this, because a normal KRaft controller doesn't use `inter.broker.listener.names`. -Although these RPCs are meant for ZK brokers, the sender in this case is actually the KRaft controller. In KRaft mode, the controller should talk to brokers via `controller.listener.names`, not `inter.broker.listener.names`. - The error seen when sending RPCs to a broker that has no PLAINTEXT listener:
{code:java}
[2023-10-23 17:12:36,788] ERROR Encountered zk migration fault: Unhandled error in SendRPCsToBrokersEvent (org.apache.kafka.server.fault.LoggingFaultHandler)
kafka.common.BrokerEndPointNotAvailableException: End point with listener name PLAINTEXT not found for broker 0
    at kafka.cluster.Broker.$anonfun$node$1(Broker.scala:94)
    at scala.Option.getOrElse(Option.scala:201)
    at kafka.cluster.Broker.node(Broker.scala:93)
    at kafka.controller.ControllerChannelManager.addNewBroker(ControllerChannelManager.scala:122)
    at kafka.controller.ControllerChannelManager.addBroker(ControllerChannelManager.scala:105)
    at kafka.migration.MigrationPropagator.$anonfun$publishMetadata$2(MigrationPropagator.scala:97)
    at kafka.migration.MigrationPropagator.$anonfun$publishMetadata$2$adapted(MigrationPropagator.scala:97)
    at scala.collection.immutable.Set$Set1.foreach(Set.scala:168)
    at kafka.migration.MigrationPropagator.publishMetadata(MigrationPropagator.scala:97)
    at kafka.migration.MigrationPropagator.sendRPCsToBrokersFromMetadataImage(MigrationPropagator.scala:217)
    at org.apache.kafka.metadata.migration.KRaftMigrationDriver$SendRPCsToBrokersEvent.run(KRaftMigrationDriver.java:723)
{code}
was: During the ZK-to-KRaft migration, before entering dual-write mode, the KRaft controller sends RPCs (i.e. UpdateMetadataRequest, LeaderAndIsrRequest, and StopReplicaRequest) to the brokers. Currently, the controller sends these RPCs over the inter-broker listener. Although these RPCs are meant for ZK brokers, the sender in this case is actually the KRaft controller. In KRaft mode, the controller should talk to brokers via `controller.listener.names`, not `inter.broker.listener.names`. This issue could also be fixed by adding the `inter.broker.listener.names` config to the KRaft controller configuration file, but it would be surprising if the KRaft controller config had to contain `inter.broker.listener.names`. The error seen when sending RPCs to a broker that has no PLAINTEXT listener:
{code:java}
[2023-10-23 17:12:36,788] ERROR Encountered zk migration fault: Unhandled error in SendRPCsToBrokersEvent (org.apache.kafka.server.fault.LoggingFaultHandler)
kafka.common.BrokerEndPointNotAvailableException: End point with listener name PLAINTEXT not found for broker 0
    at kafka.cluster.Broker.$anonfun$node$1(Broker.scala:94)
    at scala.Option.getOrElse(Option.scala:201)
    at kafka.cluster.Broker.node(Broker.scala:93)
    at kafka.controller.ControllerChannelManager.addNewBroker(ControllerChannelManager.scala:122)
    at kafka.controller.ControllerChannelManager.addBroker(ControllerChannelManager.scala:105)
    at kafka.migration.MigrationPropagator.$anonfun$publishMetadata$2(MigrationPropagator.scala:97)
    at kafka.migration.MigrationPropagator.$anonfun$publishMetadata$2$adapted(MigrationPropagator.scala:97)
    at scala.collection.immutable.Set$Set1.foreach(Set.scala:168)
    at kafka.migration.MigrationPropagator.publishMetadata(MigrationPropagator.scala:97)
    at kafka.migration.MigrationPropagator.sendRPCsToBrokersFromMetadataImage(MigrationPropagator.scala:217)
    at org.apache.kafka.metadata.migration.KRaftMigrationDriver$SendRPCsToBrokersEvent.run(KRaftMigrationDriver.java:723)
{code}
> KRaft controller should set inter broker listener when migration > > > Key: KAFKA-15670 > URL: https://issues.apache.org/jira/browse/KAFKA-15670 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.6.0 >Reporter: Luke Chen >Assignee: Luke Chen >Priority: Major > > During ZK migrating to KRaft, before entering dual-write mode, the KRaft > controller will send RPCs (i.e. UpdateMetadataRequest, LeaderAndIsrRequest, > and StopReplicaRequest) to the brokers. Current
[jira] [Updated] (KAFKA-15670) KRaft controller should set inter broker listener when migration
[ https://issues.apache.org/jira/browse/KAFKA-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen updated KAFKA-15670: -- Summary: KRaft controller should set inter broker listener when migration (was: KRaft controller using wrong listener to send RPC to brokers in dual-write mode ) > KRaft controller should set inter broker listener when migration > > > Key: KAFKA-15670 > URL: https://issues.apache.org/jira/browse/KAFKA-15670 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.6.0 >Reporter: Luke Chen >Assignee: Luke Chen >Priority: Major > > During ZK migrating to KRaft, before entering dual-write mode, the KRaft > controller will send RPCs (i.e. UpdateMetadataRequest, LeaderAndIsrRequest, > and StopReplicaRequest) to the brokers. Currently, we use the inter broker > listener to send the RPC to brokers from the controller. Although these RPCs > are used for ZK brokers, in our case, the sender is actually KRaft > controller. In KRaft mode, the controller should talk with brokers via > `controller.listener.names`, not `inter.broker.listener.names`. > This issue can also be fixed by adding `inter.broker.listener.names` config > in KRaft controller configuration file, but it would be surprised that the > KRaft controller config should contain `inter.broker.listener.names`. > > The error while sending RPCs to brokers while the broker doesn't contain > PLAINTEXT listener. 
> {code:java} > [2023-10-23 17:12:36,788] ERROR Encountered zk migration fault: Unhandled > error in SendRPCsToBrokersEvent > (org.apache.kafka.server.fault.LoggingFaultHandler) > kafka.common.BrokerEndPointNotAvailableException: End point with listener > name PLAINTEXT not found for broker 0 > at kafka.cluster.Broker.$anonfun$node$1(Broker.scala:94) > at scala.Option.getOrElse(Option.scala:201) > at kafka.cluster.Broker.node(Broker.scala:93) > at > kafka.controller.ControllerChannelManager.addNewBroker(ControllerChannelManager.scala:122) > at > kafka.controller.ControllerChannelManager.addBroker(ControllerChannelManager.scala:105) > at > kafka.migration.MigrationPropagator.$anonfun$publishMetadata$2(MigrationPropagator.scala:97) > at > kafka.migration.MigrationPropagator.$anonfun$publishMetadata$2$adapted(MigrationPropagator.scala:97) > at scala.collection.immutable.Set$Set1.foreach(Set.scala:168) > at > kafka.migration.MigrationPropagator.publishMetadata(MigrationPropagator.scala:97) > at > kafka.migration.MigrationPropagator.sendRPCsToBrokersFromMetadataImage(MigrationPropagator.scala:217) > at > org.apache.kafka.metadata.migration.KRaftMigrationDriver$SendRPCsToBrokersEvent.run(KRaftMigrationDriver.java:723) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
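The two listener settings discussed in this ticket can be illustrated with a hypothetical migration-mode controller config (all values below are made up for illustration; whether these broker-bound RPCs should go through `controller.listener.names` or through an inter-broker listener setting is exactly what the ticket debates):
{code}
# Hypothetical KRaft controller config during ZK-to-KRaft migration.
process.roles=controller
node.id=3000
controller.quorum.voters=3000@controller1:9093
controller.listener.names=CONTROLLER
listeners=CONTROLLER://controller1:9093
zookeeper.metadata.migration.enable=true
zookeeper.connect=zk1:2181
# The ticket's observation: with no inter-broker listener setting here, the
# controller resolves broker endpoints by the default PLAINTEXT listener name,
# which fails when brokers expose no PLAINTEXT listener (see the stack trace
# above: BrokerEndPointNotAvailableException).
{code}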
[jira] [Commented] (KAFKA-15670) KRaft controller using wrong listener to send RPC to brokers in dual-write mode
[ https://issues.apache.org/jira/browse/KAFKA-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17778604#comment-17778604 ] Luke Chen commented on KAFKA-15670: --- [~mumrah][~cmccabe], any comments on my plan to send RPCs to brokers via `controller.listener.names` instead? > KRaft controller using wrong listener to send RPC to brokers in dual-write > mode > > > Key: KAFKA-15670 > URL: https://issues.apache.org/jira/browse/KAFKA-15670 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.6.0 >Reporter: Luke Chen >Assignee: Luke Chen >Priority: Major > > During ZK migrating to KRaft, before entering dual-write mode, the KRaft > controller will send RPCs (i.e. UpdateMetadataRequest, LeaderAndIsrRequest, > and StopReplicaRequest) to the brokers. Currently, we use the inter broker > listener to send the RPC to brokers from the controller. Although these RPCs > are used for ZK brokers, in our case, the sender is actually KRaft > controller. In KRaft mode, the controller should talk with brokers via > `controller.listener.names`, not `inter.broker.listener.names`. > This issue can also be fixed by adding `inter.broker.listener.names` config > in KRaft controller configuration file, but it would be surprised that the > KRaft controller config should contain `inter.broker.listener.names`. > > The error while sending RPCs to brokers while the broker doesn't contain > PLAINTEXT listener. 
> {code:java} > [2023-10-23 17:12:36,788] ERROR Encountered zk migration fault: Unhandled > error in SendRPCsToBrokersEvent > (org.apache.kafka.server.fault.LoggingFaultHandler) > kafka.common.BrokerEndPointNotAvailableException: End point with listener > name PLAINTEXT not found for broker 0 > at kafka.cluster.Broker.$anonfun$node$1(Broker.scala:94) > at scala.Option.getOrElse(Option.scala:201) > at kafka.cluster.Broker.node(Broker.scala:93) > at > kafka.controller.ControllerChannelManager.addNewBroker(ControllerChannelManager.scala:122) > at > kafka.controller.ControllerChannelManager.addBroker(ControllerChannelManager.scala:105) > at > kafka.migration.MigrationPropagator.$anonfun$publishMetadata$2(MigrationPropagator.scala:97) > at > kafka.migration.MigrationPropagator.$anonfun$publishMetadata$2$adapted(MigrationPropagator.scala:97) > at scala.collection.immutable.Set$Set1.foreach(Set.scala:168) > at > kafka.migration.MigrationPropagator.publishMetadata(MigrationPropagator.scala:97) > at > kafka.migration.MigrationPropagator.sendRPCsToBrokersFromMetadataImage(MigrationPropagator.scala:217) > at > org.apache.kafka.metadata.migration.KRaftMigrationDriver$SendRPCsToBrokersEvent.run(KRaftMigrationDriver.java:723) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-15670) KRaft controller using wrong listener to send RPC to brokers in dual-write mode
[ https://issues.apache.org/jira/browse/KAFKA-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen updated KAFKA-15670: -- Description: During the ZK-to-KRaft migration, before entering dual-write mode, the KRaft controller sends RPCs (i.e. UpdateMetadataRequest, LeaderAndIsrRequest, and StopReplicaRequest) to the brokers. Currently, the controller sends these RPCs over the inter-broker listener. Although these RPCs are meant for ZK brokers, the sender in this case is actually the KRaft controller. In KRaft mode, the controller should talk to brokers via `controller.listener.names`, not `inter.broker.listener.names`. This issue could also be fixed by adding the `inter.broker.listener.names` config to the KRaft controller configuration file, but it would be surprising if the KRaft controller config had to contain `inter.broker.listener.names`. The error seen when sending RPCs to a broker that has no PLAINTEXT listener:
{code:java}
[2023-10-23 17:12:36,788] ERROR Encountered zk migration fault: Unhandled error in SendRPCsToBrokersEvent (org.apache.kafka.server.fault.LoggingFaultHandler)
kafka.common.BrokerEndPointNotAvailableException: End point with listener name PLAINTEXT not found for broker 0
    at kafka.cluster.Broker.$anonfun$node$1(Broker.scala:94)
    at scala.Option.getOrElse(Option.scala:201)
    at kafka.cluster.Broker.node(Broker.scala:93)
    at kafka.controller.ControllerChannelManager.addNewBroker(ControllerChannelManager.scala:122)
    at kafka.controller.ControllerChannelManager.addBroker(ControllerChannelManager.scala:105)
    at kafka.migration.MigrationPropagator.$anonfun$publishMetadata$2(MigrationPropagator.scala:97)
    at kafka.migration.MigrationPropagator.$anonfun$publishMetadata$2$adapted(MigrationPropagator.scala:97)
    at scala.collection.immutable.Set$Set1.foreach(Set.scala:168)
    at kafka.migration.MigrationPropagator.publishMetadata(MigrationPropagator.scala:97)
    at kafka.migration.MigrationPropagator.sendRPCsToBrokersFromMetadataImage(MigrationPropagator.scala:217)
    at org.apache.kafka.metadata.migration.KRaftMigrationDriver$SendRPCsToBrokersEvent.run(KRaftMigrationDriver.java:723)
{code}
was: During the ZK-to-KRaft migration, before entering dual-write mode, the KRaft controller sends RPCs (i.e. UpdateMetadataRequest, LeaderAndIsrRequest, and StopReplicaRequest) to the brokers. Currently, the controller sends these RPCs over the inter-broker listener. Although these RPCs are meant for ZK brokers, the sender in this case is actually the KRaft controller. In KRaft mode, the controller should talk to brokers via `controller.listener.names`, not `inter.broker.listener.names`. It would be surprising if the KRaft controller config had to contain `inter.broker.listener.names`.
{code:java}
[2023-10-23 17:12:36,788] ERROR Encountered zk migration fault: Unhandled error in SendRPCsToBrokersEvent (org.apache.kafka.server.fault.LoggingFaultHandler)
kafka.common.BrokerEndPointNotAvailableException: End point with listener name PLAINTEXT not found for broker 0
    at kafka.cluster.Broker.$anonfun$node$1(Broker.scala:94)
    at scala.Option.getOrElse(Option.scala:201)
    at kafka.cluster.Broker.node(Broker.scala:93)
    at kafka.controller.ControllerChannelManager.addNewBroker(ControllerChannelManager.scala:122)
    at kafka.controller.ControllerChannelManager.addBroker(ControllerChannelManager.scala:105)
    at kafka.migration.MigrationPropagator.$anonfun$publishMetadata$2(MigrationPropagator.scala:97)
    at kafka.migration.MigrationPropagator.$anonfun$publishMetadata$2$adapted(MigrationPropagator.scala:97)
    at scala.collection.immutable.Set$Set1.foreach(Set.scala:168)
    at kafka.migration.MigrationPropagator.publishMetadata(MigrationPropagator.scala:97)
    at kafka.migration.MigrationPropagator.sendRPCsToBrokersFromMetadataImage(MigrationPropagator.scala:217)
    at org.apache.kafka.metadata.migration.KRaftMigrationDriver$SendRPCsToBrokersEvent.run(KRaftMigrationDriver.java:723)
{code}
> KRaft controller using wrong listener to send RPC to brokers in dual-write > mode > > > Key: KAFKA-15670 > URL: https://issues.apache.org/jira/browse/KAFKA-15670 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.6.0 >Reporter: Luke Chen >Assignee: Luke Chen >Priority: Major > > During ZK migrating to KRaft, before entering dual-write mode, the KRaft > controller will send RPCs (i.e. UpdateMetadataRequest, LeaderAndIsrRequest, > and StopReplicaRequest) to the brokers. Currently, we use the inter broker > listener to send the RPC to brokers from the controller. Alt
[jira] [Assigned] (KAFKA-15670) KRaft controller using wrong listener to send RPC to brokers in dual-write mode
[ https://issues.apache.org/jira/browse/KAFKA-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen reassigned KAFKA-15670: - Assignee: Luke Chen > KRaft controller using wrong listener to send RPC to brokers in dual-write > mode > > > Key: KAFKA-15670 > URL: https://issues.apache.org/jira/browse/KAFKA-15670 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.6.0 >Reporter: Luke Chen >Assignee: Luke Chen >Priority: Major > > During ZK migrating to KRaft, before entering dual-write mode, the KRaft > controller will send RPCs (i.e. UpdateMetadataRequest, LeaderAndIsrRequest, > and StopReplicaRequest) to the brokers. Currently, we use the inter broker > listener to send the RPC to brokers from the controller. Although these RPCs > are used for ZK brokers, in our case, the sender is actually KRaft > controller. In KRaft mode, the controller should talk with brokers via > `controller.listener.names`, not `inter.broker.listener.names`. It would be > surprised that the KRaft controller config should contain > `inter.broker.listener.names`. 
> > {code:java} > [2023-10-23 17:12:36,788] ERROR Encountered zk migration fault: Unhandled > error in SendRPCsToBrokersEvent > (org.apache.kafka.server.fault.LoggingFaultHandler) > kafka.common.BrokerEndPointNotAvailableException: End point with listener > name PLAINTEXT not found for broker 0 > at kafka.cluster.Broker.$anonfun$node$1(Broker.scala:94) > at scala.Option.getOrElse(Option.scala:201) > at kafka.cluster.Broker.node(Broker.scala:93) > at > kafka.controller.ControllerChannelManager.addNewBroker(ControllerChannelManager.scala:122) > at > kafka.controller.ControllerChannelManager.addBroker(ControllerChannelManager.scala:105) > at > kafka.migration.MigrationPropagator.$anonfun$publishMetadata$2(MigrationPropagator.scala:97) > at > kafka.migration.MigrationPropagator.$anonfun$publishMetadata$2$adapted(MigrationPropagator.scala:97) > at scala.collection.immutable.Set$Set1.foreach(Set.scala:168) > at > kafka.migration.MigrationPropagator.publishMetadata(MigrationPropagator.scala:97) > at > kafka.migration.MigrationPropagator.sendRPCsToBrokersFromMetadataImage(MigrationPropagator.scala:217) > at > org.apache.kafka.metadata.migration.KRaftMigrationDriver$SendRPCsToBrokersEvent.run(KRaftMigrationDriver.java:723) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-15670) KRaft controller using wrong listener to send RPC to brokers in dual-write mode
Luke Chen created KAFKA-15670: - Summary: KRaft controller using wrong listener to send RPC to brokers in dual-write mode Key: KAFKA-15670 URL: https://issues.apache.org/jira/browse/KAFKA-15670 Project: Kafka Issue Type: Bug Affects Versions: 3.6.0 Reporter: Luke Chen During the ZK-to-KRaft migration, before entering dual-write mode, the KRaft controller sends RPCs (i.e. UpdateMetadataRequest, LeaderAndIsrRequest, and StopReplicaRequest) to the brokers. Currently, the controller sends these RPCs over the inter-broker listener. Although these RPCs are meant for ZK brokers, the sender in this case is actually the KRaft controller. In KRaft mode, the controller should talk to brokers via `controller.listener.names`, not `inter.broker.listener.names`. It would be surprising if the KRaft controller config had to contain `inter.broker.listener.names`.
{code:java}
[2023-10-23 17:12:36,788] ERROR Encountered zk migration fault: Unhandled error in SendRPCsToBrokersEvent (org.apache.kafka.server.fault.LoggingFaultHandler)
kafka.common.BrokerEndPointNotAvailableException: End point with listener name PLAINTEXT not found for broker 0
    at kafka.cluster.Broker.$anonfun$node$1(Broker.scala:94)
    at scala.Option.getOrElse(Option.scala:201)
    at kafka.cluster.Broker.node(Broker.scala:93)
    at kafka.controller.ControllerChannelManager.addNewBroker(ControllerChannelManager.scala:122)
    at kafka.controller.ControllerChannelManager.addBroker(ControllerChannelManager.scala:105)
    at kafka.migration.MigrationPropagator.$anonfun$publishMetadata$2(MigrationPropagator.scala:97)
    at kafka.migration.MigrationPropagator.$anonfun$publishMetadata$2$adapted(MigrationPropagator.scala:97)
    at scala.collection.immutable.Set$Set1.foreach(Set.scala:168)
    at kafka.migration.MigrationPropagator.publishMetadata(MigrationPropagator.scala:97)
    at kafka.migration.MigrationPropagator.sendRPCsToBrokersFromMetadataImage(MigrationPropagator.scala:217)
    at org.apache.kafka.metadata.migration.KRaftMigrationDriver$SendRPCsToBrokersEvent.run(KRaftMigrationDriver.java:723)
{code}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-15667) preCheck the invalid configuration for tiered storage replication factor
Luke Chen created KAFKA-15667: - Summary: preCheck the invalid configuration for tiered storage replication factor Key: KAFKA-15667 URL: https://issues.apache.org/jira/browse/KAFKA-15667 Project: Kafka Issue Type: Improvement Components: Tiered-Storage Affects Versions: 3.6.0 Reporter: Luke Chen `remote.log.metadata.topic.replication.factor` is the config that sets the replication factor of the remote log metadata topic. For `min.insync.replicas`, the broker config is used. Today, if `remote.log.metadata.topic.replication.factor` < `min.insync.replicas`, everything still works until new remote log metadata records are created. We should detect this at broker startup and notify users to fix the invalid config. ref: https://kafka.apache.org/documentation/#remote_log_metadata_manager_remote.log.metadata.topic.replication.factor -- This message was sent by Atlassian Jira (v8.20.10#820010)
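The startup precheck proposed above could be sketched as follows. This is an illustrative sketch only; `TieredStorageConfigCheck` and its `validate` method are hypothetical names, not Kafka code:
{code:java}
// Hypothetical startup precheck: fail fast when the remote log metadata
// topic's replication factor cannot satisfy the broker's min.insync.replicas.
public class TieredStorageConfigCheck {

    /** Returns an error message for an invalid combination, or null if valid. */
    static String validate(int metadataTopicReplicationFactor, int minInsyncReplicas) {
        if (metadataTopicReplicationFactor < minInsyncReplicas) {
            return "remote.log.metadata.topic.replication.factor ("
                    + metadataTopicReplicationFactor + ") is less than min.insync.replicas ("
                    + minInsyncReplicas + "); writes to the metadata topic would never succeed";
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(validate(1, 2)); // invalid: RF 1 < min.insync.replicas 2
        System.out.println(validate(3, 2)); // valid: prints null
    }
}
{code}
A check like this, run once at broker startup, would surface the misconfiguration immediately instead of only when new remote log metadata records are created.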
[jira] [Resolved] (KAFKA-15566) Flaky tests in FetchRequestTest.scala in KRaft mode
[ https://issues.apache.org/jira/browse/KAFKA-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen resolved KAFKA-15566. --- Fix Version/s: 3.7.0 Resolution: Fixed > Flaky tests in FetchRequestTest.scala in KRaft mode > --- > > Key: KAFKA-15566 > URL: https://issues.apache.org/jira/browse/KAFKA-15566 > Project: Kafka > Issue Type: Improvement >Reporter: Deng Ziming >Assignee: Gantigmaa Selenge >Priority: Major > Labels: flaky-test > Fix For: 3.7.0 > > > |[https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-14295/4/#showFailuresLink] > [Build / JDK 11 and Scala 2.13 / > kafka.server.FetchRequestTest.testLastFetchedEpochValidation(String).quorum=kraft|https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-14295/4/testReport/junit/kafka.server/FetchRequestTest/Build___JDK_11_and_Scala_2_13___testLastFetchedEpochValidation_String__quorum_kraft/] > [Build / JDK 11 and Scala 2.13 / > kafka.server.FetchRequestTest.testLastFetchedEpochValidationV12(String).quorum=kraft|https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-14295/4/testReport/junit/kafka.server/FetchRequestTest/Build___JDK_11_and_Scala_2_13___testLastFetchedEpochValidationV12_String__quorum_kraft/] > [Build / JDK 11 and Scala 2.13 / > kafka.server.FetchRequestTest.testFetchWithPartitionsWithIdError(String).quorum=kraft|https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-14295/4/testReport/junit/kafka.server/FetchRequestTest/Build___JDK_11_and_Scala_2_13___testFetchWithPartitionsWithIdError_String__quorum_kraft_2/] > [Build / JDK 11 and Scala 2.13 / > kafka.server.FetchRequestTest.testLastFetchedEpochValidation(String).quorum=kraft|https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-14295/4/testReport/junit/kafka.server/FetchRequestTest/Build___JDK_11_and_Scala_2_13___testLastFetchedEpochValidation_String__quorum_kraft_2/] > [Build / JDK 11 and Scala 2.13 / > 
kafka.server.FetchRequestTest.testLastFetchedEpochValidationV12(String).quorum=kraft|https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-14295/4/testReport/junit/kafka.server/FetchRequestTest/Build___JDK_11_and_Scala_2_13___testLastFetchedEpochValidationV12_String__quorum_kraft_2/]| > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-15106) AbstractStickyAssignor may stuck in 3.5
[ https://issues.apache.org/jira/browse/KAFKA-15106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen updated KAFKA-15106: -- Fix Version/s: 3.5.2 > AbstractStickyAssignor may stuck in 3.5 > --- > > Key: KAFKA-15106 > URL: https://issues.apache.org/jira/browse/KAFKA-15106 > Project: Kafka > Issue Type: Bug > Components: clients >Affects Versions: 3.5.0 >Reporter: li xiangyuan >Assignee: li xiangyuan >Priority: Major > Fix For: 3.6.0, 3.5.2 > > > This can be reproduced easily in a unit test: in > org.apache.kafka.clients.consumer.internals.AbstractStickyAssignorTest#testLargeAssignmentAndGroupWithNonEqualSubscription, > set partitionCount=200 and consumerCount=20, and you can see that > isBalanced will return false forever. > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-15609) Corrupted index uploaded to remote tier
[ https://issues.apache.org/jira/browse/KAFKA-15609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1171#comment-1171 ] Luke Chen commented on KAFKA-15609: --- [~divijvaidya], do you know where the index file was corrupted? At the end of the file? Do you have any more info on this, given we now think it's not related to the "unflushed" data? I ran a test that delays flushing data when rolling segments, and during the delay made sure the log segment and index files were uploaded to remote storage. In this case, the client can still consume the data correctly. So I'm wondering what we should do with this ticket. More info about the issue would help; if there is none, I think we can close it and re-open it if we encounter a similar issue with more info. WDYT? > Corrupted index uploaded to remote tier > --- > > Key: KAFKA-15609 > URL: https://issues.apache.org/jira/browse/KAFKA-15609 > Project: Kafka > Issue Type: Bug > Components: Tiered-Storage >Affects Versions: 3.6.0 >Reporter: Divij Vaidya >Priority: Minor > > While testing Tiered Storage, we have observed corrupt indexes being present > in remote tier. One such situation is covered here at > https://issues.apache.org/jira/browse/KAFKA-15401. This Jira presents another > such possible case of corruption. > Potential cause of index corruption: > We want to ensure that the file we are passing to RSM plugin contains all the > data which is present in MemoryByteBuffer i.e. we should have flushed the > MemoryByteBuffer to the file using force(). In Kafka, when we close a > segment, indexes are flushed asynchronously [1]. Hence, it might be possible > that when we are passing the file to RSM, the file doesn't contain flushed > data. Hence, we may end up uploading indexes which haven't been flushed yet. > Ideally, the contract should enforce that we force flush the content of > MemoryByteBuffer before we give the file for RSM. 
This will ensure that > indexes are not corrupted/incomplete. > [1] > [https://github.com/apache/kafka/blob/4150595b0a2e0f45f2827cebc60bcb6f6558745d/core/src/main/scala/kafka/log/UnifiedLog.scala#L1613] > -- This message was sent by Atlassian Jira (v8.20.10#820010)
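The contract change the ticket suggests (force-flushing the memory-mapped index content before the file is handed to the RSM plugin) can be sketched with plain Java NIO. `IndexFlushSketch` is a hypothetical name for illustration, not Kafka's actual implementation:
{code:java}
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class IndexFlushSketch {

    // Write data through a memory-mapped buffer, then force() it to disk, so
    // the on-disk file is complete before it is handed to any upload path.
    static void writeAndFlush(Path file, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer mmap = ch.map(FileChannel.MapMode.READ_WRITE, 0, data.length);
            mmap.put(data);
            mmap.force(); // flush the mapped region to disk before the upload
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("index", ".bin");
        writeAndFlush(tmp, new byte[] {1, 2, 3, 4});
        System.out.println("bytes on disk: " + Files.readAllBytes(tmp).length);
        Files.deleteIfExists(tmp);
    }
}
{code}
The key point is the `force()` call before the file leaves the writer's hands; without it, asynchronously flushed index data may not yet be in the file at upload time.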
[jira] [Resolved] (KAFKA-15620) Duplicate remote log DELETE_SEGMENT metadata is generated when there are multiple leader epochs in the segment
[ https://issues.apache.org/jira/browse/KAFKA-15620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Chen resolved KAFKA-15620. --- Fix Version/s: 3.7.0 Resolution: Duplicate > Duplicate remote log DELETE_SEGMENT metadata is generated when there are > multiple leader epochs in the segment > -- > > Key: KAFKA-15620 > URL: https://issues.apache.org/jira/browse/KAFKA-15620 > Project: Kafka > Issue Type: Bug > Components: log >Affects Versions: 3.6.0 >Reporter: Henry Cai >Priority: Major > Fix For: 3.7.0, 3.6.1 > > > Use the newly released 3.6.0, turn on tiered storage feature: > {*}remote.log.storage.system.enable{*}=true > 1. Set up topic tier5 to be remote storage enabled. Adding some data to the > topic and the data is copied to remote storage. After a few days when the > log segment is removed from remote storage due to log retention expiration, > noticed the following warnings in the server log: > [2023-10-16 22:19:10,971] DEBUG Updating remote log segment metadata: > [RemoteLogSegmentMetadataUpdate{remoteLogSegmentId=RemoteLogSegmentId > {topicIdPartition=QelVeVmER5CkjrzIiF07PQ:tier5-0, id=YFNCaWjPQFSKCngQ1QcKpA} > , customMetadata=Optional.empty, state=DELETE_SEGMENT_STARTED, > eventTimestampMs=1697005926358, brokerId=1043}] > (org.apache.kafka.server.log.remote.metadata.storage.RemoteLogMetadataCache) > [2023-10-16 22:19:10,971] WARN Error occurred while updating the remote log > segment. 
> (org.apache.kafka.server.log.remote.metadata.storage.RemotePartitionMetadataStore) > org.apache.kafka.server.log.remote.storage.RemoteResourceNotFoundException: > No remote log segment metadata found for > :RemoteLogSegmentId\{topicIdPartition=QelVeVmER5CkjrzIiF07PQ:tier5-0, > id=YFNCaWjPQFSKCngQ1QcKpA} > at > org.apache.kafka.server.log.remote.metadata.storage.RemoteLogMetadataCache.updateRemoteLogSegmentMetadata(RemoteLogMetadataCache.java:161) > at > org.apache.kafka.server.log.remote.metadata.storage.RemotePartitionMetadataStore.handleRemoteLogSegmentMetadataUpdate(RemotePartitionMetadataStore.java:89) > at > org.apache.kafka.server.log.remote.metadata.storage.RemotePartitionMetadataEventHandler.handleRemoteLogMetadata(RemotePartitionMetadataEventHandler.java:33) > at > org.apache.kafka.server.log.remote.metadata.storage.ConsumerTask.processConsumerRecord(ConsumerTask.java:157) > at > org.apache.kafka.server.log.remote.metadata.storage.ConsumerTask.run(ConsumerTask.java:133) > at java.base/java.lang.Thread.run(Thread.java:829) > > 2. After some debugging, realized the problem is *there are 2 sets of > DELETE_SEGMENT_STARTED/FINISHED pairs* in the remote metadata topic for this > segment. The DELETE_SEGMENT_FINISHED in the first set remove the segment > from the metadata cache and this caused the above exception when the > DELETE_SEGMENT_STARTED from the second set needs to be processed. > > 3. 
And traced the log to where the log retention kicked in and saw *there > were two delete log segment generated* at that time: > ``` > [2023-10-10 23:32:05,929] INFO [RemoteLogManager=1043 > partition=QelVeVmER5CkjrzIiF07PQ:tier5-0] About to delete remote log segment > RemoteLogSegmentId{topicIdPartition=QelVeVmER5CkjrzIiF07PQ:tier5-0, > id={*}YFNCaWjPQFSKCngQ1QcKpA{*}} due to retention time 60480ms breach > based on the largest record timestamp in the segment > (kafka.log.remote.RemoteLogManager$RLMTask) > [2023-10-10 23:32:05,929] INFO [RemoteLogManager=1043 > partition=QelVeVmER5CkjrzIiF07PQ:tier5-0] About to delete remote log segment > RemoteLogSegmentId{topicIdPartition=QelVeVmER5CkjrzIiF07PQ:tier5-0, > id={*}YFNCaWjPQFSKCngQ1QcKpA{*}} due to retention time 60480ms breach > based on the largest record timestamp in the segment > (kafka.log.remote.RemoteLogManager$RLMTask) > ``` > 4. And dumped out the content of the original COPY_SEGMENT_STARTED for this > segment (which triggers the generation of the later DELETE_SEGMENT metadata): > [2023-10-16 22:19:10,894] DEBUG Adding to in-progress state: > [RemoteLogSegmentMetadata{remoteLogSegmentId=RemoteLogSegmentId > {topicIdPartition=QelVeVmER5CkjrzIiF07PQ:tier5-0, id=YFNCaWjPQFSKCngQ1QcKpA} > , startOffset=6387830, endOffset=9578660, brokerId=1043, > maxTimestampMs=1696401123036, eventTimestampMs=1696401127062, > segmentLeaderEpochs=\{5=6387830, 6=6721329}, segmentSizeInBytes=511987531, > customMetadata=Optional.empty, state=COPY_SEGMENT_STARTED}] > (org.apache.kafka.server.log.remote.metadata.storage.RemoteLogMetadataCache) > > You can see there are 2 leader epochs in this segment: > *segmentLeaderEpochs=\{5=6387830, 6=6721329}* > > 5. From the remote log retention code > ([https://githu