[jira] [Resolved] (KAFKA-13910) Test metadata refresh for Kraft admin client
[ https://issues.apache.org/jira/browse/KAFKA-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming resolved KAFKA-13910. Resolution: Won't Fix > Test metadata refresh for Kraft admin client > > > Key: KAFKA-13910 > URL: https://issues.apache.org/jira/browse/KAFKA-13910 > Project: Kafka > Issue Type: Test >Reporter: dengziming >Priority: Minor > Labels: newbie > > [https://github.com/apache/kafka/pull/12110#discussion_r875418603] > Currently we don't get the real controller from MetadataCache in KRaft mode, > so we should test it in another way. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14152) Add logic to fence kraft brokers which have fallen behind in replication
[ https://issues.apache.org/jira/browse/KAFKA-14152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17603403#comment-17603403 ] dengziming commented on KAFKA-14152: > it must catch up to the current metadata before it is unfenced. Currently, we have changed the behavior to unfence a broker when it catches up to its own RegisterBrokerRecord. > Add logic to fence kraft brokers which have fallen behind in replication > > > Key: KAFKA-14152 > URL: https://issues.apache.org/jira/browse/KAFKA-14152 > Project: Kafka > Issue Type: Improvement >Reporter: Jason Gustafson >Priority: Major > > When a kraft broker registers with the controller, it must catch up to the > current metadata before it is unfenced. However, once it has been unfenced, > it only needs to continue sending heartbeats to remain unfenced. It can fall > arbitrarily behind in the replication of the metadata log and remain > unfenced. We should consider whether there is an inverse condition that we > can use to fence a broker that has fallen behind. -- This message was sent by Atlassian Jira (v8.20.10#820010)
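One possible shape for that inverse condition, as a minimal sketch; the class name and config bound below are invented for illustration, not the actual controller code:
{code:java}
// Hypothetical sketch: fence a broker whose metadata replication lags too far,
// even if its heartbeats still arrive on time. Names are illustrative only.
class BrokerMetadataLagCheck {
    private final long maxMetadataLagRecords; // invented config bound

    BrokerMetadataLagCheck(long maxMetadataLagRecords) {
        this.maxMetadataLagRecords = maxMetadataLagRecords;
    }

    // Evaluated on each heartbeat against the broker's reported metadata offset.
    boolean shouldFence(long brokerMetadataOffset, long committedOffset) {
        return committedOffset - brokerMetadataOffset > maxMetadataLagRecords;
    }
}
{code}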
[jira] [Updated] (KAFKA-14210) client quota config key is sanitized in Kraft broker
[ https://issues.apache.org/jira/browse/KAFKA-14210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming updated KAFKA-14210: --- Description: We sanitize the key in ZK mode but don't sanitize it in KRaft mode.
{code:java}
public class AdminClientExample {
    public static void main(String[] args) throws ExecutionException, InterruptedException {
        Properties properties = new Properties();
        properties.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        Admin admin = Admin.create(properties);
        // Alter config
        admin.alterClientQuotas(Collections.singleton(
                new ClientQuotaAlteration(
                        new ClientQuotaEntity(Collections.singletonMap(ClientQuotaEntity.CLIENT_ID, "")),
                        Collections.singletonList(new ClientQuotaAlteration.Op("request_percentage", 0.02)))
        )).all().get();
        Map<ClientQuotaEntity, Map<String, Double>> clientQuotaEntityMapMap = admin.describeClientQuotas(
                ClientQuotaFilter.contains(Collections.singletonList(ClientQuotaFilterComponent.ofDefaultEntity(ClientQuotaEntity.CLIENT_ID)))
        ).entities().get();
        System.out.println(clientQuotaEntityMapMap);
    }
}
{code}
The code should have returned request_percentage=0.02, but it returned an empty map.
was: We sanitize the key in ZK mode but don't sanitize it in KRaft mode.
{code:java}
public class AdminClientExample {
    public static void main(String[] args) throws ExecutionException, InterruptedException {
        Properties properties = new Properties();
        properties.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        Admin admin = Admin.create(properties);
        // Alter config
        admin.alterClientQuotas(Collections.singleton(
                new ClientQuotaAlteration(
                        new ClientQuotaEntity(Collections.singletonMap(ClientQuotaEntity.CLIENT_ID, "default")),
                        Collections.singletonList(new ClientQuotaAlteration.Op("request_percentage", 0.02)))
        )).all().get();
        Map<ClientQuotaEntity, Map<String, Double>> clientQuotaEntityMapMap = admin.describeClientQuotas(
                ClientQuotaFilter.contains(Collections.singletonList(ClientQuotaFilterComponent.ofDefaultEntity(ClientQuotaEntity.CLIENT_ID)))
        ).entities().get();
        System.out.println(clientQuotaEntityMapMap);
    }
}
{code}
The code should have returned request_percentage=0.02, but it returned an empty map.
> client quota config key is sanitized in Kraft broker > -- > > Key: KAFKA-14210 > URL: https://issues.apache.org/jira/browse/KAFKA-14210 > Project: Kafka > Issue Type: Improvement >Reporter: dengziming >Assignee: dengziming >Priority: Major > > We sanitize the key in ZK mode but don't sanitize it in KRaft mode.
> {code:java}
> public class AdminClientExample {
>     public static void main(String[] args) throws ExecutionException, InterruptedException {
>         Properties properties = new Properties();
>         properties.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
>         Admin admin = Admin.create(properties);
>         // Alter config
>         admin.alterClientQuotas(Collections.singleton(
>                 new ClientQuotaAlteration(
>                         new ClientQuotaEntity(Collections.singletonMap(ClientQuotaEntity.CLIENT_ID, "")),
>                         Collections.singletonList(new ClientQuotaAlteration.Op("request_percentage", 0.02)))
>         )).all().get();
>         Map<ClientQuotaEntity, Map<String, Double>> clientQuotaEntityMapMap = admin.describeClientQuotas(
>                 ClientQuotaFilter.contains(Collections.singletonList(ClientQuotaFilterComponent.ofDefaultEntity(ClientQuotaEntity.CLIENT_ID)))
>         ).entities().get();
>         System.out.println(clientQuotaEntityMapMap);
>     }
> }
> {code}
> The code should have returned request_percentage=0.02, but it returned an empty map. -- This message was sent by Atlassian Jira (v8.20.10#820010)
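For context on what "sanitized" means here: in ZK mode, quota entity names are URL-encoded via the Sanitizer utility in kafka-clients before being used as config keys. A minimal sketch of that round trip (the Sanitizer class and its methods are real; the client-id value is made up):
{code:java}
import org.apache.kafka.common.utils.Sanitizer;

public class SanitizeDemo {
    public static void main(String[] args) {
        String clientId = "client/with=special:chars";       // made-up example value
        String sanitized = Sanitizer.sanitize(clientId);      // URL-encoded form used as a config key
        System.out.println(sanitized);
        System.out.println(Sanitizer.desanitize(sanitized));  // round-trips to the original
    }
}
{code}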
[jira] [Resolved] (KAFKA-14210) client quota config key is sanitized in Kraft broker
[ https://issues.apache.org/jira/browse/KAFKA-14210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming resolved KAFKA-14210. Resolution: Not A Problem > client quota config key is sanitized in Kraft broker > -- > > Key: KAFKA-14210 > URL: https://issues.apache.org/jira/browse/KAFKA-14210 > Project: Kafka > Issue Type: Improvement >Reporter: dengziming >Assignee: dengziming >Priority: Major > > We sanitize the key in ZK mode but don't sanitize it in KRaft mode.
> {code:java}
> public class AdminClientExample {
>     public static void main(String[] args) throws ExecutionException, InterruptedException {
>         Properties properties = new Properties();
>         properties.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
>         Admin admin = Admin.create(properties);
>         // Alter config
>         admin.alterClientQuotas(Collections.singleton(
>                 new ClientQuotaAlteration(
>                         new ClientQuotaEntity(Collections.singletonMap(ClientQuotaEntity.CLIENT_ID, "default")),
>                         Collections.singletonList(new ClientQuotaAlteration.Op("request_percentage", 0.02)))
>         )).all().get();
>         Map<ClientQuotaEntity, Map<String, Double>> clientQuotaEntityMapMap = admin.describeClientQuotas(
>                 ClientQuotaFilter.contains(Collections.singletonList(ClientQuotaFilterComponent.ofDefaultEntity(ClientQuotaEntity.CLIENT_ID)))
>         ).entities().get();
>         System.out.println(clientQuotaEntityMapMap);
>     }
> }
> {code}
> The code should have returned request_percentage=0.02, but it returned an empty map. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14210) client quota config key is sanitized in Kraft broker
[ https://issues.apache.org/jira/browse/KAFKA-14210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming updated KAFKA-14210: --- Description: We sanitize the key in ZK mode but don't sanitize it in KRaft mode.
{code:java}
public class AdminClientExample {
    public static void main(String[] args) throws ExecutionException, InterruptedException {
        Properties properties = new Properties();
        properties.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        Admin admin = Admin.create(properties);
        // Alter config
        admin.alterClientQuotas(Collections.singleton(
                new ClientQuotaAlteration(
                        new ClientQuotaEntity(Collections.singletonMap(ClientQuotaEntity.CLIENT_ID, "default")),
                        Collections.singletonList(new ClientQuotaAlteration.Op("request_percentage", 0.02)))
        )).all().get();
        Map<ClientQuotaEntity, Map<String, Double>> clientQuotaEntityMapMap = admin.describeClientQuotas(
                ClientQuotaFilter.contains(Collections.singletonList(ClientQuotaFilterComponent.ofDefaultEntity(ClientQuotaEntity.CLIENT_ID)))
        ).entities().get();
        System.out.println(clientQuotaEntityMapMap);
    }
}
{code}
The code should have returned request_percentage=0.02, but it returned an empty map.
was: We sanitize the key in ZK mode but don't sanitize it in KRaft mode.
> client quota config key is sanitized in Kraft broker > -- > > Key: KAFKA-14210 > URL: https://issues.apache.org/jira/browse/KAFKA-14210 > Project: Kafka > Issue Type: Improvement >Reporter: dengziming >Assignee: dengziming >Priority: Major > > We sanitize the key in ZK mode but don't sanitize it in KRaft mode.
> {code:java}
> public class AdminClientExample {
>     public static void main(String[] args) throws ExecutionException, InterruptedException {
>         Properties properties = new Properties();
>         properties.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
>         Admin admin = Admin.create(properties);
>         // Alter config
>         admin.alterClientQuotas(Collections.singleton(
>                 new ClientQuotaAlteration(
>                         new ClientQuotaEntity(Collections.singletonMap(ClientQuotaEntity.CLIENT_ID, "default")),
>                         Collections.singletonList(new ClientQuotaAlteration.Op("request_percentage", 0.02)))
>         )).all().get();
>         Map<ClientQuotaEntity, Map<String, Double>> clientQuotaEntityMapMap = admin.describeClientQuotas(
>                 ClientQuotaFilter.contains(Collections.singletonList(ClientQuotaFilterComponent.ofDefaultEntity(ClientQuotaEntity.CLIENT_ID)))
>         ).entities().get();
>         System.out.println(clientQuotaEntityMapMap);
>     }
> }
> {code}
> The code should have returned request_percentage=0.02, but it returned an empty map. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14210) client quota config key is sanitized in Kraft broker
[ https://issues.apache.org/jira/browse/KAFKA-14210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming updated KAFKA-14210: --- Description: We sanitize the key in ZK mode but don't sanitize it in KRaft mode.
(was: Update client quota default config successfully:
root@7604c498c154:/opt/kafka-3.2.0# bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter --entity-type clients --entity-default --add-config REQUEST_PERCENTAGE_OVERRIDE_CONFIG=0.01
Describe client quota default config, but there is no output:
Only quota configs can be added for 'clients' using --bootstrap-server. Unexpected config names: Set(REQUEST_PERCENTAGE_OVERRIDE_CONFIG)
root@7604c498c154:/opt/kafka-3.2.0# bin/kafka-configs.sh --bootstrap-server localhost:9092 --describe --entity-type clients --entity-default
The reason is that we sanitize the key in ZK mode but don't sanitize it in KRaft mode.)
> client quota config key is sanitized in Kraft broker > -- > > Key: KAFKA-14210 > URL: https://issues.apache.org/jira/browse/KAFKA-14210 > Project: Kafka > Issue Type: Improvement >Reporter: dengziming >Assignee: dengziming >Priority: Major > > We sanitize the key in ZK mode but don't sanitize it in KRaft mode. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14210) client quota config key is sanitized in Kraft broker
dengziming created KAFKA-14210: -- Summary: client quota config key is sanitized in Kraft broker Key: KAFKA-14210 URL: https://issues.apache.org/jira/browse/KAFKA-14210 Project: Kafka Issue Type: Improvement Reporter: dengziming Assignee: dengziming
Update client quota default config successfully:
root@7604c498c154:/opt/kafka-3.2.0# bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter --entity-type clients --entity-default --add-config REQUEST_PERCENTAGE_OVERRIDE_CONFIG=0.01
Describe client quota default config, but there is no output:
Only quota configs can be added for 'clients' using --bootstrap-server. Unexpected config names: Set(REQUEST_PERCENTAGE_OVERRIDE_CONFIG)
root@7604c498c154:/opt/kafka-3.2.0# bin/kafka-configs.sh --bootstrap-server localhost:9092 --describe --entity-type clients --entity-default
The reason is that we sanitize the key in ZK mode but don't sanitize it in KRaft mode. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14197) Kraft broker fails to startup after topic creation failure
[ https://issues.apache.org/jira/browse/KAFKA-14197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17600279#comment-17600279 ] dengziming commented on KAFKA-14197: Basically, a record will be persisted after being applied, so it is similar to a transaction. If there is an unexpected exception between applying and persisting, we should call `QuorumController.renounce` to revert the in-memory state, which calls `snapshotRegistry.revertToSnapshot(lastCommittedOffset)`. Maybe there are some edge cases where we fail to revert the in-memory state? > Kraft broker fails to startup after topic creation failure > -- > > Key: KAFKA-14197 > URL: https://issues.apache.org/jira/browse/KAFKA-14197 > Project: Kafka > Issue Type: Bug > Components: kraft >Reporter: Luke Chen >Priority: Major > > In kraft ControllerWriteEvent, we start by trying to apply the record to > controller in-memory state, then send out the record via the raft client. But if > there is an error while sending the records, there's no way to revert the > change to controller in-memory state[1]. > The issue happened when creating topics: controller state is updated with > topic and partition metadata (ex: broker to ISR map), but the record doesn't > get sent out successfully (ex: buffer allocation error). Then, when shutting down > the node, the controlled shutdown will try to remove the broker from ISR > by[2]:
> {code:java}
> generateLeaderAndIsrUpdates("enterControlledShutdown[" + brokerId + "]",
>     brokerId, NO_LEADER, records,
>     brokersToIsrs.partitionsWithBrokerInIsr(brokerId));
> {code}
> After we append the partitionChangeRecords and send them to the metadata topic > successfully, the brokers fail to "replay" these partition > changes since these topics/partitions didn't get created successfully > before. > Even worse, after restarting the node, all the metadata records are replayed > again, the same error happens again, and the broker cannot start up > successfully.
> > The error and call stack is like this; basically, it complains the topic image can't be found:
> {code:java}
> [2022-09-02 16:29:16,334] ERROR Encountered metadata loading fault: Error replaying metadata log record at offset 81 (org.apache.kafka.server.fault.LoggingFaultHandler)
> java.lang.NullPointerException
>     at org.apache.kafka.image.TopicDelta.replay(TopicDelta.java:69)
>     at org.apache.kafka.image.TopicsDelta.replay(TopicsDelta.java:91)
>     at org.apache.kafka.image.MetadataDelta.replay(MetadataDelta.java:248)
>     at org.apache.kafka.image.MetadataDelta.replay(MetadataDelta.java:186)
>     at kafka.server.metadata.BrokerMetadataListener.$anonfun$loadBatches$3(BrokerMetadataListener.scala:239)
>     at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
>     at kafka.server.metadata.BrokerMetadataListener.kafka$server$metadata$BrokerMetadataListener$$loadBatches(BrokerMetadataListener.scala:232)
>     at kafka.server.metadata.BrokerMetadataListener$HandleCommitsEvent.run(BrokerMetadataListener.scala:113)
>     at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:121)
>     at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:200)
>     at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:173)
>     at java.base/java.lang.Thread.run(Thread.java:829)
> {code}
> [1] [https://github.com/apache/kafka/blob/ef65b6e566ef69b2f9b58038c98a5993563d7a68/metadata/src/main/java/org/apache/kafka/controller/QuorumController.java#L779-L804]
> [2] [https://github.com/apache/kafka/blob/trunk/metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java#L1270]
-- This message was sent by Atlassian Jira (v8.20.10#820010)
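A minimal sketch of the apply-then-append flow and the revert path being discussed; the field and method names below are simplified stand-ins for the QuorumController internals, assuming that reverting to the last committed offset restores a consistent state:
{code:java}
// Simplified sketch; not the actual QuorumController code.
void runWriteEvent(List<ApiMessageAndVersion> records) {
    long revertTo = lastCommittedOffset; // snapshot id to fall back to
    try {
        records.forEach(this::replay);                   // 1. apply to in-memory state
        raftClient.scheduleAtomicAppend(epoch, records); // 2. may fail, e.g. buffer allocation
    } catch (Throwable t) {
        // Roll the in-memory state back so it matches what was actually persisted.
        snapshotRegistry.revertToSnapshot(revertTo);
        throw t;
    }
}
{code}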
[jira] [Resolved] (KAFKA-13990) Update features will fail in KRaft mode
[ https://issues.apache.org/jira/browse/KAFKA-13990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming resolved KAFKA-13990. Resolution: Fixed > Update features will fail in KRaft mode > --- > > Key: KAFKA-13990 > URL: https://issues.apache.org/jira/browse/KAFKA-13990 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: dengziming >Assignee: dengziming >Priority: Blocker > Fix For: 3.3.0 > > > We return empty supported features in the Controller ApiVersionResponse, so > {{quorumSupportedFeature}} will always return empty; we should return > Map(metadata.version -> latest) instead. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-13850) kafka-metadata-shell is missing some record types
[ https://issues.apache.org/jira/browse/KAFKA-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming resolved KAFKA-13850. Fix Version/s: 3.3 Resolution: Fixed > kafka-metadata-shell is missing some record types > - > > Key: KAFKA-13850 > URL: https://issues.apache.org/jira/browse/KAFKA-13850 > Project: Kafka > Issue Type: Bug > Components: kraft >Reporter: David Arthur >Assignee: dengziming >Priority: Major > Fix For: 3.3 > > > Noticed while working on feature flags in KRaft, the in-memory tree of the > metadata (MetadataNodeManager) is missing support for a few record types. > * DelegationTokenRecord > * UserScramCredentialRecord (should we include this?) > * FeatureLevelRecord > * AccessControlEntryRecord > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-13991) Add Admin.updateFeatures() API.
[ https://issues.apache.org/jira/browse/KAFKA-13991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming reassigned KAFKA-13991: -- Assignee: dengziming > Add Admin.updateFeatures() API. > --- > > Key: KAFKA-13991 > URL: https://issues.apache.org/jira/browse/KAFKA-13991 > Project: Kafka > Issue Type: Bug >Reporter: dengziming >Assignee: dengziming >Priority: Minor > Labels: kip-required > > We have Admin.updateFeatures(Map<String, FeatureUpdate>, UpdateFeaturesOptions) but don't have Admin.updateFeatures(Map<String, FeatureUpdate>); this is inconsistent with other Admin APIs. > We may need a KIP to change this. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14073) Logging the reason for creating a snapshot
dengziming created KAFKA-14073: -- Summary: Logging the reason for creating a snapshot Key: KAFKA-14073 URL: https://issues.apache.org/jira/browse/KAFKA-14073 Project: Kafka Issue Type: Improvement Reporter: dengziming So far we have two reasons for creating a snapshot: 1. X bytes were applied. 2. The metadata version changed. We should log the reason when creating a snapshot, on both the broker side and the controller side. See https://github.com/apache/kafka/pull/12265#discussion_r915972383 -- This message was sent by Atlassian Jira (v8.20.10#820010)
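One possible shape for this, as a sketch; the enum and message format are illustrative, not the committed implementation:
{code:java}
// Illustrative only: carry the trigger along so it can be logged.
enum SnapshotReason {
    MAX_BYTES_EXCEEDED,       // reason 1: X bytes were applied since the last snapshot
    METADATA_VERSION_CHANGED  // reason 2: the metadata version changed
}

void maybeStartSnapshot(long offset, SnapshotReason reason) {
    log.info("Creating a new snapshot at offset {} because: {}", offset, reason);
    // ... existing snapshot generation ...
}
{code}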
[jira] [Created] (KAFKA-14047) Use KafkaRaftManager in KRaft TestKit
dengziming created KAFKA-14047: -- Summary: Use KafkaRaftManager in KRaft TestKit Key: KAFKA-14047 URL: https://issues.apache.org/jira/browse/KAFKA-14047 Project: Kafka Issue Type: Test Reporter: dengziming We are using the lower-level {{ControllerServer}} and {{BrokerServer}} in TestKit; we can improve it to use KafkaRaftManager. See the discussion here: https://github.com/apache/kafka/pull/12157#discussion_r882179407 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-13914) Implement kafka-metadata-quorum.sh
[ https://issues.apache.org/jira/browse/KAFKA-13914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming reassigned KAFKA-13914: -- Assignee: dengziming > Implement kafka-metadata-quorum.sh > -- > > Key: KAFKA-13914 > URL: https://issues.apache.org/jira/browse/KAFKA-13914 > Project: Kafka > Issue Type: Improvement >Reporter: Jason Gustafson >Assignee: dengziming >Priority: Major > > KIP-595 documents a tool for describing quorum status > `kafka-metadata-quorum.sh`: > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-595%3A+A+Raft+Protocol+for+the+Metadata+Quorum#KIP595:ARaftProtocolfortheMetadataQuorum-ToolingSupport.] > We need to implement this. > Note that this depends on the Admin API for `DescribeQuorum`, which is > proposed in KIP-836: > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-836%3A+Addition+of+Information+in+DescribeQuorumResponse+about+Voter+Lag.] > -- This message was sent by Atlassian Jira (v8.20.7#820007)
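For reference, a usage sketch based on the KIP's tooling section (exact flags may differ in the final implementation):
{code}
# Describe the state of the metadata quorum (sketch; flags per KIP-595/KIP-836).
bin/kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --status
bin/kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --replication
{code}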
[jira] [Created] (KAFKA-13991) Add Admin.updateFeatures() API.
dengziming created KAFKA-13991: -- Summary: Add Admin.updateFeatures() API. Key: KAFKA-13991 URL: https://issues.apache.org/jira/browse/KAFKA-13991 Project: Kafka Issue Type: Bug Reporter: dengziming We have Admin.updateFeatures(Map<String, FeatureUpdate>, UpdateFeaturesOptions) but don't have Admin.updateFeatures(Map<String, FeatureUpdate>); this is inconsistent with other Admin APIs. We may need a KIP to change this. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (KAFKA-13990) Update features will fail in KRaft mode
dengziming created KAFKA-13990: -- Summary: Update features will fail in KRaft mode Key: KAFKA-13990 URL: https://issues.apache.org/jira/browse/KAFKA-13990 Project: Kafka Issue Type: Bug Reporter: dengziming Assignee: dengziming We return empty supported features in the Controller ApiVersionResponse, so {{quorumSupportedFeature}} will always return empty; we should return Map(metadata.version -> latest) instead. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (KAFKA-7342) Migrate streams modules to JUnit 5
[ https://issues.apache.org/jira/browse/KAFKA-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17552056#comment-17552056 ] dengziming commented on KAFKA-7342: --- [~christo_lolov] Thank you for doing this. > Migrate streams modules to JUnit 5 > -- > > Key: KAFKA-7342 > URL: https://issues.apache.org/jira/browse/KAFKA-7342 > Project: Kafka > Issue Type: Sub-task > Components: streams, unit tests >Reporter: Ismael Juma >Assignee: Chia-Ping Tsai >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (KAFKA-13959) Controller should unfence Broker with busy metadata log
[ https://issues.apache.org/jira/browse/KAFKA-13959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17551534#comment-17551534 ] dengziming commented on KAFKA-13959: I haven't found the root cause. I just printed the brokerOffset and controllerOffset on each heartbeat, and I found that every time the brokerOffset bumps, the controllerOffset also bumps.
```
time: 1654679131904 broker 0 brokerOffset:27 controllerOffset:28
time: 1654679132115 broker 0 brokerOffset:27 controllerOffset:28
time: 1654679132381 broker 0 brokerOffset:28 controllerOffset:29
time: 1654679132592 broker 0 brokerOffset:28 controllerOffset:29
time: 1654679132878 broker 0 brokerOffset:29 controllerOffset:30
time: 1654679133089 broker 0 brokerOffset:29 controllerOffset:30
time: 1654679133299 broker 0 brokerOffset:30 controllerOffset:31
time: 1654679133509 broker 0 brokerOffset:30 controllerOffset:31
```
I tried to increase the interval between heartbeats but got the same result, and if I set numberControllerNodes to 1, the problem disappears. I think this may be related to the logic of how we compute the leader HW and follower HW. > Controller should unfence Broker with busy metadata log > --- > > Key: KAFKA-13959 > URL: https://issues.apache.org/jira/browse/KAFKA-13959 > Project: Kafka > Issue Type: Bug > Components: kraft >Affects Versions: 3.3.0 >Reporter: Jose Armando Garcia Sancio >Priority: Blocker > > https://issues.apache.org/jira/browse/KAFKA-13955 showed that it is possible > for the controller to not unfence a broker if the committed offset keeps > increasing. > > One solution to this problem is to require the broker to only catch up to the > last committed offset as of when it last sent a heartbeat. For example: > # Broker sends a heartbeat with current offset of {{Y}}. The last commit > offset is {{X}}. The controller remembers this last commit offset; call it > {{X'}} > # Broker sends another heartbeat with current offset of {{Z}}. Unfence > the broker if {{Z >= X}} or {{Z >= X'}}. > > This change should also set the default for MetadataMaxIdleIntervalMs back to > 500. -- This message was sent by Atlassian Jira (v8.20.7#820007)
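For reference, the two-heartbeat rule proposed in the issue description could look roughly like the sketch below; the class and method names are illustrative, not the actual controller code:
{code:java}
// Illustrative sketch of "unfence if Z >= X'": remember the committed offset seen
// at the previous heartbeat and compare the broker's current offset against it.
class BrokerHeartbeatTracker {
    private long committedOffsetAtLastHeartbeat = -1; // X' from the previous heartbeat

    boolean shouldUnfence(long brokerOffset, long currentCommittedOffset) {
        boolean caughtUp = committedOffsetAtLastHeartbeat >= 0
                && brokerOffset >= committedOffsetAtLastHeartbeat; // Z >= X'
        committedOffsetAtLastHeartbeat = currentCommittedOffset;   // becomes X' next time
        return caughtUp;
    }
}
{code}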
[jira] [Created] (KAFKA-13968) Broker should not generate a snapshot until it has been unfenced
dengziming created KAFKA-13968: -- Summary: Broker should not generate a snapshot until it has been unfenced Key: KAFKA-13968 URL: https://issues.apache.org/jira/browse/KAFKA-13968 Project: Kafka Issue Type: Bug Components: kraft Reporter: dengziming Assignee: dengziming There is a bug when computing `FeaturesDelta` which causes us to generate a snapshot on every commit.
[2022-06-08 13:07:43,010] INFO [BrokerMetadataSnapshotter id=0] Creating a new snapshot at offset 0... (kafka.server.metadata.BrokerMetadataSnapshotter:66)
[2022-06-08 13:07:43,222] INFO [BrokerMetadataSnapshotter id=0] Creating a new snapshot at offset 2... (kafka.server.metadata.BrokerMetadataSnapshotter:66)
[2022-06-08 13:07:43,727] INFO [BrokerMetadataSnapshotter id=0] Creating a new snapshot at offset 3... (kafka.server.metadata.BrokerMetadataSnapshotter:66)
[2022-06-08 13:07:44,228] INFO [BrokerMetadataSnapshotter id=0] Creating a new snapshot at offset 4... (kafka.server.metadata.BrokerMetadataSnapshotter:66)
Before a broker is unfenced, it won't start publishing metadata, so it's meaningless to generate a snapshot. -- This message was sent by Atlassian Jira (v8.20.7#820007)
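The guard implied here could look roughly like the sketch below; the publishing check is a hypothetical stand-in, not the actual BrokerMetadataSnapshotter API:
{code:java}
// Illustrative guard: skip snapshot generation while the broker is still fenced
// and therefore not yet publishing metadata.
void maybeSnapshot(long committedOffset) {
    if (!metadataPublisher.isPublishing()) { // hypothetical check
        log.debug("Skipping snapshot at offset {}: broker is still fenced", committedOffset);
        return;
    }
    snapshotter.maybeStartSnapshot(committedOffset);
}
{code}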
[jira] [Commented] (KAFKA-13959) Controller should unfence Broker with busy metadata log
[ https://issues.apache.org/jira/browse/KAFKA-13959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17550912#comment-17550912 ] dengziming commented on KAFKA-13959: When BrokerLifecycleManager is starting up, it will send a heartbeat every 10 milliseconds rather than every 2000 milliseconds: `scheduleNextCommunication(NANOSECONDS.convert(10, MILLISECONDS))`, which is already smaller than 500 ms, so the reason for this bug is more complex; I need more time to investigate. > Controller should unfence Broker with busy metadata log > --- > > Key: KAFKA-13959 > URL: https://issues.apache.org/jira/browse/KAFKA-13959 > Project: Kafka > Issue Type: Bug > Components: kraft >Affects Versions: 3.3.0 >Reporter: Jose Armando Garcia Sancio >Priority: Blocker > > https://issues.apache.org/jira/browse/KAFKA-13955 showed that it is possible > for the controller to not unfence a broker if the committed offset keeps > increasing. > > One solution to this problem is to require the broker to only catch up to the > last committed offset as of when it last sent a heartbeat. For example: > # Broker sends a heartbeat with current offset of {{Y}}. The last commit > offset is {{X}}. The controller remembers this last commit offset; call it > {{X'}} > # Broker sends another heartbeat with current offset of {{Z}}. Unfence > the broker if {{Z >= X}} or {{Z >= X'}}. > > This change should also set the default for MetadataMaxIdleIntervalMs back to > 500. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (KAFKA-13959) Controller should unfence Broker with busy metadata log
[ https://issues.apache.org/jira/browse/KAFKA-13959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17548215#comment-17548215 ] dengziming commented on KAFKA-13959: I will take a look at this if no one is assigned. > Controller should unfence Broker with busy metadata log > --- > > Key: KAFKA-13959 > URL: https://issues.apache.org/jira/browse/KAFKA-13959 > Project: Kafka > Issue Type: Bug > Components: kraft >Affects Versions: 3.3.0 >Reporter: Jose Armando Garcia Sancio >Priority: Blocker > > https://issues.apache.org/jira/browse/KAFKA-13955 showed that it is possible > for the controller to not unfence a broker if the committed offset keeps > increasing. > > One solution to this problem is to require the broker to only catch up to the > last committed offset as of when it last sent a heartbeat. For example: > # Broker sends a heartbeat with current offset of {{Y}}. The last commit > offset is {{X}}. The controller remembers this last commit offset; call it > {{X'}} > # Broker sends another heartbeat with current offset of {{Z}}. Unfence > the broker if {{Z >= X}} or {{Z >= X'}}. > > This change should also set the default for MetadataMaxIdleIntervalMs back to > 500. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Resolved] (KAFKA-13845) Add support for reading KRaft snapshots in kafka-dump-log
[ https://issues.apache.org/jira/browse/KAFKA-13845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming resolved KAFKA-13845. Resolution: Fixed > Add support for reading KRaft snapshots in kafka-dump-log > - > > Key: KAFKA-13845 > URL: https://issues.apache.org/jira/browse/KAFKA-13845 > Project: Kafka > Issue Type: Sub-task >Reporter: David Arthur >Assignee: dengziming >Priority: Minor > Labels: kip-500 > > Even though the metadata snapshots use the same format as log segments, the > kafka-dump-log tool (DumpLogSegments.scala) does not support the file > extension or file name pattern. > For example, a metadata snapshot will be named like: > {code:java} > __cluster_metadata-0/0004-01.checkpoint{code} > whereas regular log segments (including the metadata log) are named like: > {code:java} > __cluster_metadata-0/.log {code} > > We need to enhance the tool to support snapshots. -- This message was sent by Atlassian Jira (v8.20.7#820007)
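With this change, dumping a snapshot looks roughly like the following; the checkpoint file name below is a placeholder, not a real path:
{code}
# Rough usage; <snapshot-offset>-<epoch>.checkpoint is a placeholder file name.
bin/kafka-dump-log.sh --cluster-metadata-decoder \
  --files __cluster_metadata-0/<snapshot-offset>-<epoch>.checkpoint
{code}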
[jira] [Resolved] (KAFKA-13833) Remove the "min_version_level" version from the finalized version range that is written to ZooKeeper
[ https://issues.apache.org/jira/browse/KAFKA-13833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming resolved KAFKA-13833. Resolution: Fixed > Remove the "min_version_level" version from the finalized version range that > is written to ZooKeeper > > > Key: KAFKA-13833 > URL: https://issues.apache.org/jira/browse/KAFKA-13833 > Project: Kafka > Issue Type: Sub-task >Reporter: dengziming >Assignee: dengziming >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Resolved] (KAFKA-12902) Add UINT32 type in generator
[ https://issues.apache.org/jira/browse/KAFKA-12902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming resolved KAFKA-12902. Resolution: Fixed > Add UINT32 type in generator > > > Key: KAFKA-12902 > URL: https://issues.apache.org/jira/browse/KAFKA-12902 > Project: Kafka > Issue Type: Improvement >Reporter: dengziming >Assignee: dengziming >Priority: Major > > We support uint32 in Struct but don't support uint32 in the protocol generator. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (KAFKA-13718) kafka-topics describe topic with default config will show `segment.bytes` overridden config
[ https://issues.apache.org/jira/browse/KAFKA-13718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539137#comment-17539137 ] dengziming commented on KAFKA-13718: Hello [~rjoerger]
# This issue is enough to track the problem, so another issue is unnecessary.
# We did not intend to include the default `segment.bytes` in the output, but it was printed (due to a bug); this bug has been around for a long time, so [~mimaison] suggested we keep this behavior in the future (make it bug-compatible).
# Since we are trying to keep the behavior unchanged, a KIP is unnecessary.
# As for the source of the bug, we currently haven't found it; you can investigate it and add a comment to explain why we should keep this behavior.
> kafka-topics describe topic with default config will show `segment.bytes` > overridden config > > > Key: KAFKA-13718 > URL: https://issues.apache.org/jira/browse/KAFKA-13718 > Project: Kafka > Issue Type: Bug > Components: tools >Affects Versions: 3.1.0, 2.8.1, 3.0.0 >Reporter: Luke Chen >Priority: Major > Labels: newbie, newbie++ > > Following the quickstart guide[1], when describing the topic just created > with default config, I found there's an overridden config shown: > _> bin/kafka-topics.sh --describe --topic quickstart-events > --bootstrap-server localhost:9092_ > _Topic: quickstart-events TopicId: 06zRrzDCRceR9zWAf_BUWQ > PartitionCount: 1 ReplicationFactor: 1 *Configs: > segment.bytes=1073741824*_ > _Topic: quickstart-events Partition: 0 Leader: 0 Replicas: 0 > Isr: 0_ > > This config result should be empty as in the Kafka quickstart page. Although the > config value is what we expected (the default 1GB value), this display still > confuses users. > > Note: I checked the 2.8.1 build; this issue also happens there. > > [1]: [https://kafka.apache.org/quickstart] -- This message was sent by Atlassian Jira (v8.20.7#820007)
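For users hit by this, a hedged example of telling genuine overrides apart from defaults: the Admin API's ConfigEntry.source() reports where each value comes from, regardless of what the CLI prints (the topic name below is from the quickstart; everything else is the standard Admin API):
{code:java}
import java.util.Collections;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class ShowTopicOverrides {
    // Prints only configs that are true per-topic overrides, not broker defaults.
    static void printOverrides(Admin admin, String topic) throws Exception {
        ConfigResource resource = new ConfigResource(ConfigResource.Type.TOPIC, topic);
        Config config = admin.describeConfigs(Collections.singleton(resource))
                .all().get().get(resource);
        for (ConfigEntry entry : config.entries()) {
            if (entry.source() == ConfigEntry.ConfigSource.DYNAMIC_TOPIC_CONFIG) {
                System.out.println(entry.name() + " = " + entry.value());
            }
        }
    }
}
{code}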
[jira] [Commented] (KAFKA-13914) Implement kafka-metadata-quorum.sh
[ https://issues.apache.org/jira/browse/KAFKA-13914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539136#comment-17539136 ] dengziming commented on KAFKA-13914: [~hachikuji] I'd love to take this after KIP-836.👌 > Implement kafka-metadata-quorum.sh > -- > > Key: KAFKA-13914 > URL: https://issues.apache.org/jira/browse/KAFKA-13914 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Major > > KIP-595 documents a tool for describing quorum status > `kafka-metadata-quorum.sh`: > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-595%3A+A+Raft+Protocol+for+the+Metadata+Quorum#KIP595:ARaftProtocolfortheMetadataQuorum-ToolingSupport.] > We need to implement this. > Note that this depends on the Admin API for `DescribeQuorum`, which is > proposed in KIP-836: > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-836%3A+Addition+of+Information+in+DescribeQuorumResponse+about+Voter+Lag.] > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (KAFKA-13910) Test metadata refresh for Kraft admin client
dengziming created KAFKA-13910: -- Summary: Test metadata refresh for Kraft admin client Key: KAFKA-13910 URL: https://issues.apache.org/jira/browse/KAFKA-13910 Project: Kafka Issue Type: Test Reporter: dengziming [https://github.com/apache/kafka/pull/12110#discussion_r875418603] Currently we don't get the real controller from MetadataCache in KRaft mode, so we should test it in another way. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (KAFKA-13907) Fix hanging ServerShutdownTest.testCleanShutdownWithKRaftControllerUnavailable
dengziming created KAFKA-13907: -- Summary: Fix hanging ServerShutdownTest.testCleanShutdownWithKRaftControllerUnavailable Key: KAFKA-13907 URL: https://issues.apache.org/jira/browse/KAFKA-13907 Project: Kafka Issue Type: Bug Reporter: dengziming ServerShutdownTest.testCleanShutdownWithKRaftControllerUnavailable will hang waiting for controlled shutdown; there may be some bug related to it. Since this bug can be reproduced locally, it won't be hard to investigate. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (KAFKA-9837) New RPC for notifying controller of failed replica
[ https://issues.apache.org/jira/browse/KAFKA-9837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536522#comment-17536522 ] dengziming commented on KAFKA-9837: --- [~soarez] Thank you for your proposals. I think this is what we were thinking about: by letting the controller record which directory hosts each replica, we can avoid sending an O(n) request. But I think the names ASSIGN_REPLICAS_TO_FAIL_GROUP and FAIL_GROUP are not clear; how about ASSIGN_REPLICAS_TO_DIRECTORY and FAIL_DIRECTORY? BTW, we have decided to mark KRaft as production-ready soon in KIP-833, so this KIP will be useful and important. > New RPC for notifying controller of failed replica > -- > > Key: KAFKA-9837 > URL: https://issues.apache.org/jira/browse/KAFKA-9837 > Project: Kafka > Issue Type: New Feature > Components: controller, core >Reporter: David Arthur >Assignee: dengziming >Priority: Major > Labels: kip-500 > > This is the tracking ticket for > [KIP-589|https://cwiki.apache.org/confluence/display/KAFKA/KIP-589+Add+API+to+update+Replica+state+in+Controller]. > For the bridge release, brokers should no longer use ZooKeeper to notify the > controller that a log dir has failed. They should instead use an RPC mechanism. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Resolved] (KAFKA-13893) Add BrokerApiVersions Api in AdminClient
[ https://issues.apache.org/jira/browse/KAFKA-13893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming resolved KAFKA-13893. Resolution: Invalid > Add BrokerApiVersions Api in AdminClient > > > Key: KAFKA-13893 > URL: https://issues.apache.org/jira/browse/KAFKA-13893 > Project: Kafka > Issue Type: Task >Reporter: dengziming >Assignee: dengziming >Priority: Major > Labels: kip-required > > We already have a BrokerApiVersionsCommand to get the broker API versions, yet we > lack a similar API in AdminClient. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (KAFKA-13893) Add BrokerApiVersions Api in AdminClient
[ https://issues.apache.org/jira/browse/KAFKA-13893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17534984#comment-17534984 ] dengziming commented on KAFKA-13893: [~mimaison] Thank you for pointing this out. > Add BrokerApiVersions Api in AdminClient > > > Key: KAFKA-13893 > URL: https://issues.apache.org/jira/browse/KAFKA-13893 > Project: Kafka > Issue Type: Task >Reporter: dengziming >Assignee: dengziming >Priority: Major > Labels: kip-required > > We already have a BrokerApiVersionsCommand to get the broker API versions, yet we > lack a similar API in AdminClient. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (KAFKA-13893) Add BrokerApiVersions Api in AdminClient
dengziming created KAFKA-13893: -- Summary: Add BrokerApiVersions Api in AdminClient Key: KAFKA-13893 URL: https://issues.apache.org/jira/browse/KAFKA-13893 Project: Kafka Issue Type: Task Reporter: dengziming Assignee: dengziming We already have a BrokerApiVersionsCommand to get the broker API versions, yet we lack a similar API in AdminClient. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Resolved] (KAFKA-13854) Refactor ApiVersion to MetadataVersion
[ https://issues.apache.org/jira/browse/KAFKA-13854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming resolved KAFKA-13854. Resolution: Fixed > Refactor ApiVersion to MetadataVersion > -- > > Key: KAFKA-13854 > URL: https://issues.apache.org/jira/browse/KAFKA-13854 > Project: Kafka > Issue Type: Sub-task >Reporter: David Arthur >Assignee: Alyssa Huang >Priority: Major > > In KRaft, we will have a value for {{metadata.version}} corresponding to each > IBP. In order to keep this association and make it obvious for developers, we > will consolidate the IBP and metadata version into a new MetadataVersion > enum. This new enum will replace the existing ApiVersion trait. > For IBPs that precede the first KRaft preview version (AK 3.0), we will use a > value of -1 for the metadata.version. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (KAFKA-13863) Prevent null config value when create topic in KRaft mode
dengziming created KAFKA-13863: -- Summary: Prevent null config value when create topic in KRaft mode Key: KAFKA-13863 URL: https://issues.apache.org/jira/browse/KAFKA-13863 Project: Kafka Issue Type: Bug Reporter: dengziming -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Assigned] (KAFKA-13862) Add And Subtract multiple config values is not supported in KRaft mode
[ https://issues.apache.org/jira/browse/KAFKA-13862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming reassigned KAFKA-13862: -- Assignee: dengziming > Add And Subtract multiple config values is not supported in KRaft mode > -- > > Key: KAFKA-13862 > URL: https://issues.apache.org/jira/browse/KAFKA-13862 > Project: Kafka > Issue Type: Bug >Reporter: dengziming >Assignee: dengziming >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (KAFKA-13862) Add And Subtract multiple config values is not supported in KRaft mode
dengziming created KAFKA-13862: -- Summary: Add And Subtract multiple config values is not supported in KRaft mode Key: KAFKA-13862 URL: https://issues.apache.org/jira/browse/KAFKA-13862 Project: Kafka Issue Type: Bug Reporter: dengziming -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Resolved] (KAFKA-13859) SCRAM authentication issues with kafka-clients 3.0.1
[ https://issues.apache.org/jira/browse/KAFKA-13859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming resolved KAFKA-13859. Resolution: Not A Problem In KIP-679 ([https://cwiki.apache.org/confluence/display/KAFKA/KIP-679%3A+Producer+will+enable+the+strongest+delivery+guarantee+by+default#KIP679:Producerwillenablethestrongestdeliveryguaranteebydefault-%60IDEMPOTENT_WRITE%60Deprecation]) we are relaxing the ACL restriction from {{IDEMPOTENT_WRITE}} to {{WRITE}} earlier (release version 2.8) and changing the producer defaults later (release version 3.0) in order to give the community users enough time to upgrade their brokers first. That way, their later client-side upgrade, which enables idempotence by default, won't get blocked by the {{IDEMPOTENT_WRITE}} ACL required by old-version brokers. So this is designed intentionally; we should help users make this change. > SCRAM authentication issues with kafka-clients 3.0.1 > > > Key: KAFKA-13859 > URL: https://issues.apache.org/jira/browse/KAFKA-13859 > Project: Kafka > Issue Type: Bug > Components: clients >Affects Versions: 3.0.1 >Reporter: Oliver Payne >Assignee: dengziming >Priority: Major > > When attempting to produce records to Kafka using a client configured with > SCRAM authentication, the authentication is being rejected, and the following > exception is thrown: > {{org.apache.kafka.common.errors.ClusterAuthorizationException: Cluster > authorization failed.}} > I am seeing this happen with a Springboot service that was recently upgraded > to 2.6.5. After looking into this, I learned that Springboot moved to > kafka-clients 3.0.1 from 3.0.0 in that version. And sure enough, downgrading > to kafka-clients resolved the issue, with no changes made to the configs. > I have also attempted to connect to a separate server with kafka-clients > 3.0.1, using plaintext authentication. That works fine. So the issue appears > to be with SCRAM authentication. > I will note that I am attempting to connect to an AWS MSK instance. We use > SCRAM-SHA-512 as our sasl mechanism, using the basic {{ScramLoginModule.}} -- This message was sent by Atlassian Jira (v8.20.7#820007)
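For operators still on brokers that enforce the old restriction, the corresponding ACL grant looks roughly like this (the principal name is illustrative):
{code}
# Hedged example: grant the cluster-level IdempotentWrite ACL that idempotent
# producers need on brokers older than 2.8. 'User:myproducer' is illustrative.
bin/kafka-acls.sh --bootstrap-server localhost:9092 --add \
  --allow-principal User:myproducer --operation IdempotentWrite --cluster
{code}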
[jira] [Assigned] (KAFKA-13859) SCRAM authentication issues with kafka-clients 3.0.1
[ https://issues.apache.org/jira/browse/KAFKA-13859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming reassigned KAFKA-13859: -- Assignee: dengziming > SCRAM authentication issues with kafka-clients 3.0.1 > > > Key: KAFKA-13859 > URL: https://issues.apache.org/jira/browse/KAFKA-13859 > Project: Kafka > Issue Type: Bug > Components: clients >Affects Versions: 3.0.1 >Reporter: Oliver Payne >Assignee: dengziming >Priority: Major > > When attempting to produce records to Kafka using a client configured with > SCRAM authentication, the authentication is being rejected, and the following > exception is thrown: > {{org.apache.kafka.common.errors.ClusterAuthorizationException: Cluster > authorization failed.}} > I am seeing this happen with a Springboot service that was recently upgraded > to 2.6.5. After looking into this, I learned that Springboot moved to > kafka-clients 3.0.1 from 3.0.0 in that version. And sure enough, downgrading > to kafka-clients resolved the issue, with no changes made to the configs. > I have also attempted to connect to a separate server with kafka-clients > 3.0.1, using plaintext authentication. That works fine. So the issue appears > to be with SCRAM authentication. > I will note that I am attempting to connect to an AWS MSK instance. We use > SCRAM-SHA-512 as our sasl mechanism, using the basic {{ScramLoginModule.}} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (KAFKA-13859) SCRAM authentication issues with kafka-clients 3.0.1
[ https://issues.apache.org/jira/browse/KAFKA-13859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17529204#comment-17529204 ] dengziming commented on KAFKA-13859: This may be related to the fact that we enabled idempotence by default in 3.0.1. Maybe you can explicitly set "enable.idempotence = false" in your ProducerConfig to solve this problem. [~opayne] > SCRAM authentication issues with kafka-clients 3.0.1 > > > Key: KAFKA-13859 > URL: https://issues.apache.org/jira/browse/KAFKA-13859 > Project: Kafka > Issue Type: Bug > Components: clients >Affects Versions: 3.0.1 >Reporter: Oliver Payne >Priority: Major > > When attempting to produce records to Kafka using a client configured with > SCRAM authentication, the authentication is being rejected, and the following > exception is thrown: > {{org.apache.kafka.common.errors.ClusterAuthorizationException: Cluster > authorization failed.}} > I am seeing this happen with a Springboot service that was recently upgraded > to 2.6.5. After looking into this, I learned that Springboot moved to > kafka-clients 3.0.1 from 3.0.0 in that version. And sure enough, downgrading > to kafka-clients resolved the issue, with no changes made to the configs. > I have also attempted to connect to a separate server with kafka-clients > 3.0.1, using plaintext authentication. That works fine. So the issue appears > to be with SCRAM authentication. > I will note that I am attempting to connect to an AWS MSK instance. We use > SCRAM-SHA-512 as our sasl mechanism, using the basic {{ScramLoginModule.}} -- This message was sent by Atlassian Jira (v8.20.7#820007)
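The suggested workaround in client code, as a short sketch using standard producer configs:
{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

public class NonIdempotentProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Opt out of idempotence so the producer no longer needs the cluster-level
        // IDEMPOTENT_WRITE permission on clusters whose ACLs don't grant it.
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "false");
        KafkaProducer<String, String> producer =
                new KafkaProducer<>(props, new StringSerializer(), new StringSerializer());
        producer.close();
    }
}
{code}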
[jira] [Commented] (KAFKA-13859) SCRAM authentication issues with kafka-clients 3.0.1
[ https://issues.apache.org/jira/browse/KAFKA-13859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17529169#comment-17529169 ] dengziming commented on KAFKA-13859: [~opayne] Do you have a more detailed exception stack trace? That would be useful. Also, is the ClusterAuthorizationException from the client or from the broker log? > SCRAM authentication issues with kafka-clients 3.0.1 > > > Key: KAFKA-13859 > URL: https://issues.apache.org/jira/browse/KAFKA-13859 > Project: Kafka > Issue Type: Bug > Components: clients >Affects Versions: 3.0.1 >Reporter: Oliver Payne >Priority: Major > > When attempting to produce records to Kafka using a client configured with > SCRAM authentication, the authentication is being rejected, and the following > exception is thrown: > {{org.apache.kafka.common.errors.ClusterAuthorizationException: Cluster > authorization failed.}} > I am seeing this happen with a Springboot service that was recently upgraded > to 2.6.5. After looking into this, I learned that Springboot moved to > kafka-clients 3.0.1 from 3.0.0 in that version. And sure enough, downgrading > to kafka-clients resolved the issue, with no changes made to the configs. > I have also attempted to connect to a separate server with kafka-clients > 3.0.1, using plaintext authentication. That works fine. So the issue appears > to be with SCRAM authentication. > I will note that I am attempting to connect to an AWS MSK instance. We use > SCRAM-SHA-512 as our sasl mechanism, using the basic {{ScramLoginModule.}} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Assigned] (KAFKA-13850) kafka-metadata-shell is missing some record types
[ https://issues.apache.org/jira/browse/KAFKA-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming reassigned KAFKA-13850: -- Assignee: dengziming > kafka-metadata-shell is missing some record types > - > > Key: KAFKA-13850 > URL: https://issues.apache.org/jira/browse/KAFKA-13850 > Project: Kafka > Issue Type: Bug > Components: kraft >Reporter: David Arthur >Assignee: dengziming >Priority: Major > > Noticed while working on feature flags in KRaft, the in-memory tree of the > metadata (MetadataNodeManager) is missing support for a few record types. > * DelegationTokenRecord > * UserScramCredentialRecord (should we include this?) > * FeatureLevelRecord > * AccessControlEntryRecord > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (KAFKA-13850) kafka-metadata-shell is missing some record types
[ https://issues.apache.org/jira/browse/KAFKA-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526746#comment-17526746 ] dengziming commented on KAFKA-13850: In fact I already found this problem. However, DelegationTokenRecord and UserScramCredentialRecord are currently not used, and FeatureLevelRecord will only be useful after KIP-778, so I'm waiting for KIP-778 to be done before submitting this PR. > kafka-metadata-shell is missing some record types > - > > Key: KAFKA-13850 > URL: https://issues.apache.org/jira/browse/KAFKA-13850 > Project: Kafka > Issue Type: Bug > Components: kraft >Reporter: David Arthur >Priority: Major > > Noticed while working on feature flags in KRaft, the in-memory tree of the > metadata (MetadataNodeManager) is missing support for a few record types. > * DelegationTokenRecord > * UserScramCredentialRecord (should we include this?) > * FeatureLevelRecord > * AccessControlEntryRecord > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Assigned] (KAFKA-13620) The request handler metric name for ControllerApis should be different than KafkaApis
[ https://issues.apache.org/jira/browse/KAFKA-13620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming reassigned KAFKA-13620: -- Assignee: dengziming > The request handler metric name for ControllerApis should be different than > KafkaApis > - > > Key: KAFKA-13620 > URL: https://issues.apache.org/jira/browse/KAFKA-13620 > Project: Kafka > Issue Type: Bug >Reporter: Colin McCabe >Assignee: dengziming >Priority: Major > Labels: kip-500 > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Assigned] (KAFKA-13845) Add support for reading KRaft snapshots in kafka-dump-log
[ https://issues.apache.org/jira/browse/KAFKA-13845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming reassigned KAFKA-13845: -- Assignee: dengziming > Add support for reading KRaft snapshots in kafka-dump-log > - > > Key: KAFKA-13845 > URL: https://issues.apache.org/jira/browse/KAFKA-13845 > Project: Kafka > Issue Type: Sub-task >Reporter: David Arthur >Assignee: dengziming >Priority: Minor > Labels: kip-500 > > Even though the metadata snapshots use the same format as log segments, the > kafka-dump-log tool (DumpLogSegments.scala) does not support the file > extension or file name pattern. > For example, a metadata snapshot will be named like: > {code:java} > __cluster_metadata-0/0004-01.checkpoint{code} > whereas regular log segments (including the metadata log) are named like: > {code:java} > __cluster_metadata-0/.log {code} > > We need to enhance the tool to support snapshots. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (KAFKA-13845) Add support for reading KRaft snapshots in kafka-dump-log
[ https://issues.apache.org/jira/browse/KAFKA-13845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526187#comment-17526187 ] dengziming commented on KAFKA-13845: One difference we should note is that a metadata snapshot contains a header and a footer. > Add support for reading KRaft snapshots in kafka-dump-log > - > > Key: KAFKA-13845 > URL: https://issues.apache.org/jira/browse/KAFKA-13845 > Project: Kafka > Issue Type: Sub-task >Reporter: David Arthur >Priority: Minor > Labels: kip-500 > > Even though the metadata snapshots use the same format as log segments, the > kafka-dump-log tool (DumpLogSegments.scala) does not support the file > extension or file name pattern. > For example, a metadata snapshot will be named like: > {code:java} > __cluster_metadata-0/0004-01.checkpoint{code} > whereas regular log segments (including the metadata log) are named like: > {code:java} > __cluster_metadata-0/.log {code} > > We need to enhance the tool to support snapshots. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Assigned] (KAFKA-13836) Improve KRaft broker heartbeat logic
[ https://issues.apache.org/jira/browse/KAFKA-13836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming reassigned KAFKA-13836: -- Assignee: dengziming > Improve KRaft broker heartbeat logic > > > Key: KAFKA-13836 > URL: https://issues.apache.org/jira/browse/KAFKA-13836 > Project: Kafka > Issue Type: Improvement >Reporter: dengziming >Assignee: dengziming >Priority: Major > > # Don't advertise an offset to the controller until it has been published > # Only unfence a broker when it has seen its own registration -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (KAFKA-13836) Improve KRaft broker heartbeat logic
dengziming created KAFKA-13836: -- Summary: Improve KRaft broker heartbeat logic Key: KAFKA-13836 URL: https://issues.apache.org/jira/browse/KAFKA-13836 Project: Kafka Issue Type: Improvement Reporter: dengziming # Don't advertise an offset to the controller until it has been published (see the sketch below) # Only unfence a broker when it has seen its own registration -- This message was sent by Atlassian Jira (v8.20.7#820007)
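A rough sketch of item 1; the publisher accessor is a hypothetical stand-in, while BrokerHeartbeatRequestData and its fields come from the KRaft heartbeat protocol:
{code:java}
// Illustrative: only advertise a metadata offset the broker has actually published.
long publishedOffset = metadataPublisher.lastPublishedOffset(); // hypothetical accessor
BrokerHeartbeatRequestData data = new BrokerHeartbeatRequestData()
        .setBrokerId(brokerId)
        .setBrokerEpoch(brokerEpoch)
        .setCurrentMetadataOffset(publishedOffset >= 0 ? publishedOffset : -1L);
{code}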
[jira] [Created] (KAFKA-13833) Remove the "min_version_level" version from the finalized version range that is written to ZooKeeper
dengziming created KAFKA-13833: -- Summary: Remove the "min_version_level" version from the finalized version range that is written to ZooKeeper Key: KAFKA-13833 URL: https://issues.apache.org/jira/browse/KAFKA-13833 Project: Kafka Issue Type: Sub-task Reporter: dengziming Assignee: dengziming -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (KAFKA-13832) Flaky test TopicCommandIntegrationTest.testAlterAssignment
dengziming created KAFKA-13832: -- Summary: Flaky test TopicCommandIntegrationTest.testAlterAssignment Key: KAFKA-13832 URL: https://issues.apache.org/jira/browse/KAFKA-13832 Project: Kafka Issue Type: Improvement Components: unit tests Reporter: dengziming -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (KAFKA-13242) KRaft Controller doesn't handle UpdateFeaturesRequest
[ https://issues.apache.org/jira/browse/KAFKA-13242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming resolved KAFKA-13242. Resolution: Fixed > KRaft Controller doesn't handle UpdateFeaturesRequest > - > > Key: KAFKA-13242 > URL: https://issues.apache.org/jira/browse/KAFKA-13242 > Project: Kafka > Issue Type: Sub-task >Reporter: dengziming >Assignee: dengziming >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (KAFKA-13788) Creation of invalid dynamic config prevents further creation of valid configs
[ https://issues.apache.org/jira/browse/KAFKA-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming reassigned KAFKA-13788: -- Assignee: dengziming > Creation of invalid dynamic config prevents further creation of valid configs > - > > Key: KAFKA-13788 > URL: https://issues.apache.org/jira/browse/KAFKA-13788 > Project: Kafka > Issue Type: Bug > Components: config >Affects Versions: 2.8.0 >Reporter: Prateek Agarwal >Assignee: dengziming >Priority: Minor > > Kafka currently allows creating an unknown dynamic config without any errors. > But it errors when the next valid dynamic config gets created. > This can be seen locally in a cluster by creating a wrong config > {{log.cleaner.threadzz}}, which then prevents creation of the valid config > {{log.cleaner.threads}}:
> {code}
> # Invalid config 'log.cleaner.threadzz' gets created without issues
> $ ./bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter --add-config log.cleaner.threadzz=2 --entity-type brokers --entity-default 2>&1
> Completed updating default config for brokers in the cluster.
> {code}
> Now when a valid config is added, {{kafka-configs.sh}} errors out:
> {code}
> $ ./bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter --add-config log.cleaner.threads=2 --entity-type brokers --entity-default
> All sensitive broker config entries must be specified for --alter, missing entries: Set(log.cleaner.threadzz)
> {code}
> To fix this, one needs to first delete the incorrect config:
> {code:java}
> $ ./bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter --delete-config log.cleaner.threadzz --entity-type brokers --entity-default
> {code}
> But ideally, the invalid config should error out so that creation of the valid config doesn't get prevented. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (KAFKA-13743) kraft controller should prevent topics with conflicting metrics names from being created
[ https://issues.apache.org/jira/browse/KAFKA-13743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming resolved KAFKA-13743. Reviewer: Colin McCabe Resolution: Fixed > kraft controller should prevent topics with conflicting metrics names from > being created > > > Key: KAFKA-13743 > URL: https://issues.apache.org/jira/browse/KAFKA-13743 > Project: Kafka > Issue Type: Bug >Reporter: Colin McCabe >Assignee: dengziming >Priority: Major > Labels: kip-500 > Fix For: 3.3.0 > > > The kraft controller should prevent topics with conflicting metrics names > from being created, like the zk code does. > Example: > {code} > [cmccabe@zeratul kafka1]$ ./bin/kafka-topics.sh --create --topic f.oo > --bootstrap-server localhost:9092 > > WARNING: Due to limitations in metric names, topics with a period ('.') or > underscore ('_') could collide. To avoid issues it is best to use either, but > not both. > Created topic f.oo. > > > [cmccabe@zeratul kafka1]$ ./bin/kafka-topics.sh --create --topic f_oo > --bootstrap-server localhost:9092 > WARNING: Due to limitations in metric names, topics with a period ('.') or > underscore ('_') could collide. To avoid issues it is best to use either, but > not both. > Error while executing topic command : Topic 'f_oo' collides with existing > topics: f.oo > [2022-03-15 09:48:49,563] ERROR > org.apache.kafka.common.errors.InvalidTopicException: Topic 'f_oo' collides > with existing topics: f.oo > (kafka.admin.TopicCommand$) > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
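The collision rule itself is simple to sketch: since '.' and '_' map to the same character in metric names, two topic names collide exactly when they become equal after normalization. Illustrative code only, not the controller's implementation:
{code:java}
final class TopicNameCollisions {
    // Metric names replace '.' with '_', so normalize both names the same way.
    static String normalize(String topic) {
        return topic.replace('.', '_');
    }

    // "f.oo" and "f_oo" both normalize to "f_oo" and therefore collide.
    static boolean collides(String a, String b) {
        return !a.equals(b) && normalize(a).equals(normalize(b));
    }
}
{code}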
[jira] [Updated] (KAFKA-13743) kraft controller should prevent topics with conflicting metrics names from being created
[ https://issues.apache.org/jira/browse/KAFKA-13743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming updated KAFKA-13743: --- Fix Version/s: 3.3.0 > kraft controller should prevent topics with conflicting metrics names from > being created > > > Key: KAFKA-13743 > URL: https://issues.apache.org/jira/browse/KAFKA-13743 > Project: Kafka > Issue Type: Bug >Reporter: Colin McCabe >Assignee: dengziming >Priority: Major > Labels: kip-500 > Fix For: 3.3.0 > > > The kraft controller should prevent topics with conflicting metrics names > from being created, like the zk code does. > Example: > {code} > [cmccabe@zeratul kafka1]$ ./bin/kafka-topics.sh --create --topic f.oo > --bootstrap-server localhost:9092 > > WARNING: Due to limitations in metric names, topics with a period ('.') or > underscore ('_') could collide. To avoid issues it is best to use either, but > not both. > Created topic f.oo. > > > [cmccabe@zeratul kafka1]$ ./bin/kafka-topics.sh --create --topic f_oo > --bootstrap-server localhost:9092 > WARNING: Due to limitations in metric names, topics with a period ('.') or > underscore ('_') could collide. To avoid issues it is best to use either, but > not both. > Error while executing topic command : Topic 'f_oo' collides with existing > topics: f.oo > [2022-03-15 09:48:49,563] ERROR > org.apache.kafka.common.errors.InvalidTopicException: Topic 'f_oo' collides > with existing topics: f.oo > (kafka.admin.TopicCommand$) > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (KAFKA-13788) Creation of invalid dynamic config prevents further creation of valid configs
[ https://issues.apache.org/jira/browse/KAFKA-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515716#comment-17515716 ] dengziming commented on KAFKA-13788: Also reproduced this in KRaft mode. > Creation of invalid dynamic config prevents further creation of valid configs > - > > Key: KAFKA-13788 > URL: https://issues.apache.org/jira/browse/KAFKA-13788 > Project: Kafka > Issue Type: Bug > Components: config >Affects Versions: 2.8.0 >Reporter: Prateek Agarwal >Priority: Minor > > Kafka currently allows creating an unknown dynamic config without any errors. > But it errors when the next valid dynamic config gets created. > This can be seen locally in a cluster by creating a wrong config > {{log.cleaner.threadzz}}, which prevents creation of the valid config > {{log.cleaner.threads}} later. > {code} > # Invalid config 'log.cleaner.threadzz' gets created without issues > $ ./bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter > --add-config log.cleaner.threadzz=2 --entity-type brokers --entity-default 2>&1 > Completed updating default config for brokers in the cluster. > {code} > Now when a valid config is added, {{kafka-configs.sh}} errors out: > {code} > $ ./bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter > --add-config log.cleaner.threads=2 --entity-type brokers --entity-default > All sensitive broker config entries must be specified for --alter, missing > entries: Set(log.cleaner.threadzz) > {code} > To fix this, one needs to first delete the incorrect config: > {code:java} > $ ./bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter > --delete-config log.cleaner.threadzz --entity-type brokers --entity-default > {code} > But ideally, the invalid config should error out so that creation of the > valid config doesn't get prevented. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (KAFKA-13737) Flaky kafka.admin.LeaderElectionCommandTest.testPreferredReplicaElection
[ https://issues.apache.org/jira/browse/KAFKA-13737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming reassigned KAFKA-13737: -- Assignee: dengziming > Flaky kafka.admin.LeaderElectionCommandTest.testPreferredReplicaElection > > > Key: KAFKA-13737 > URL: https://issues.apache.org/jira/browse/KAFKA-13737 > Project: Kafka > Issue Type: Bug >Reporter: Guozhang Wang >Assignee: dengziming >Priority: Blocker > Fix For: 3.2.0 > > > Examples: > https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-11895/1/tests > {code} > java.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node > assignment. Call: describeTopics > at > java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396) > at > java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073) > at > org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) > at > kafka.utils.TestUtils$.$anonfun$waitForLeaderToBecome$1(TestUtils.scala:1812) > at scala.util.Try$.apply(Try.scala:210) > at kafka.utils.TestUtils$.currentLeader$1(TestUtils.scala:1811) > at kafka.utils.TestUtils$.waitForLeaderToBecome(TestUtils.scala:1819) > at kafka.utils.TestUtils$.assertLeader(TestUtils.scala:1789) > at > kafka.admin.LeaderElectionCommandTest.testPreferredReplicaElection(LeaderElectionCommandTest.scala:172) > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (KAFKA-13743) kraft controller should prevent topics with conflicting metrics names from being created
[ https://issues.apache.org/jira/browse/KAFKA-13743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming reassigned KAFKA-13743: -- Assignee: dengziming > kraft controller should prevent topics with conflicting metrics names from > being created > > > Key: KAFKA-13743 > URL: https://issues.apache.org/jira/browse/KAFKA-13743 > Project: Kafka > Issue Type: Bug >Reporter: Colin McCabe >Assignee: dengziming >Priority: Major > Labels: kip-500 > > The kraft controller should prevent topics with conflicting metrics names > from being created, like the zk code does. > Example: > {code} > [cmccabe@zeratul kafka1]$ ./bin/kafka-topics.sh --create --topic f.oo > --bootstrap-server localhost:9092 > > WARNING: Due to limitations in metric names, topics with a period ('.') or > underscore ('_') could collide. To avoid issues it is best to use either, but > not both. > Created topic f.oo. > > > [cmccabe@zeratul kafka1]$ ./bin/kafka-topics.sh --create --topic f_oo > --bootstrap-server localhost:9092 > WARNING: Due to limitations in metric names, topics with a period ('.') or > underscore ('_') could collide. To avoid issues it is best to use either, but > not both. > Error while executing topic command : Topic 'f_oo' collides with existing > topics: f.oo > [2022-03-15 09:48:49,563] ERROR > org.apache.kafka.common.errors.InvalidTopicException: Topic 'f_oo' collides > with existing topics: f.oo > (kafka.admin.TopicCommand$) > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (KAFKA-13749) TopicConfigs and ErrorCode are not set in createTopics response in KRaft
[ https://issues.apache.org/jira/browse/KAFKA-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17507974#comment-17507974 ] dengziming commented on KAFKA-13749: Thank you for reporting this. Can you provide more details on the bug or show an example? > TopicConfigs and ErrorCode are not set in createTopics response in KRaft > > > Key: KAFKA-13749 > URL: https://issues.apache.org/jira/browse/KAFKA-13749 > Project: Kafka > Issue Type: Bug > Components: kraft >Reporter: Akhilesh Chaganti >Assignee: Akhilesh Chaganti >Priority: Major > > Once the createTopics request is processed in KRaft, the `CreatableTopicResult` > is not set with the appropriate topic configs and error code, and this breaks > KIP-525 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (KAFKA-13733) Reset consumer group offset with not exist topic throw wrong exception
[ https://issues.apache.org/jira/browse/KAFKA-13733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming resolved KAFKA-13733. Resolution: Not A Problem > Reset consumer group offset with not exist topic throw wrong exception > -- > > Key: KAFKA-13733 > URL: https://issues.apache.org/jira/browse/KAFKA-13733 > Project: Kafka > Issue Type: Bug > Components: core >Reporter: Yujie Li >Priority: Major > > Hey, > I'm seen a bug with misleading exception when I try to reset consumer group > offset with not exist topic by: > `kafka-consumer-groups --bootstrap-server $brokers --reset-offsets --group > --topic :0 --to-offset ` > And got: > > ``` > Error: Executing consumer group command failed due to > org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node > assignment. Call: metadatajava.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node > assignment. Call: metadata > at > java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) > at > java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) > at > org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) > at > kafka.admin.ConsumerGroupCommand$ConsumerGroupService.getLogStartOffsets(ConsumerGroupCommand.scala:654) > at > kafka.admin.ConsumerGroupCommand$ConsumerGroupService.checkOffsetsRange(ConsumerGroupCommand.scala:888) > at > kafka.admin.ConsumerGroupCommand$ConsumerGroupService.prepareOffsetsToReset(ConsumerGroupCommand.scala:796) > at > kafka.admin.ConsumerGroupCommand$ConsumerGroupService.$anonfun$resetOffsets$1(ConsumerGroupCommand.scala:437) > at scala.collection.IterableOnceOps.foldLeft(IterableOnce.scala:646) > at scala.collection.IterableOnceOps.foldLeft$(IterableOnce.scala:642) > at scala.collection.AbstractIterable.foldLeft(Iterable.scala:919) > at > kafka.admin.ConsumerGroupCommand$ConsumerGroupService.resetOffsets(ConsumerGroupCommand.scala:432) > at kafka.admin.ConsumerGroupCommand$.run(ConsumerGroupCommand.scala:76) > at kafka.admin.ConsumerGroupCommand$.main(ConsumerGroupCommand.scala:59) > at kafka.admin.ConsumerGroupCommand.main(ConsumerGroupCommand.scala) > Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting > for a node assignment. Call: metadata > I think it should throw TopicNotExistException/NotAuthorizeException instead. > ``` > Thoughts: we should add topic permission / exist check in reset offset call > https://github.com/apache/kafka/blob/3.1/core/src/main/scala/kafka/admin/ConsumerGroupCommand.scala#L421 > > Let me know what do you think! > > Thanks, > Yujie -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (KAFKA-13733) Reset consumer group offset with not exist topic throw wrong exception
[ https://issues.apache.org/jira/browse/KAFKA-13733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17507346#comment-17507346 ] dengziming commented on KAFKA-13733: I don't think this is a bug. In fact, UnknownTopicOrPartitionException is a retriable error, so the client will retry until the timeout expires, and you then receive a TimeoutException. Maybe you should inspect your log4j properties so you can see the INFO logs emitted while retrying. > Reset consumer group offset with not exist topic throw wrong exception > -- > > Key: KAFKA-13733 > URL: https://issues.apache.org/jira/browse/KAFKA-13733 > Project: Kafka > Issue Type: Bug > Components: core >Reporter: Yujie Li >Priority: Major > > Hey, > I'm seen a bug with misleading exception when I try to reset consumer group > offset with not exist topic by: > `kafka-consumer-groups --bootstrap-server $brokers --reset-offsets --group > --topic :0 --to-offset ` > And got: > > ``` > Error: Executing consumer group command failed due to > org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node > assignment. Call: metadatajava.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node > assignment. Call: metadata > at > java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) > at > java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) > at > org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165) > at > kafka.admin.ConsumerGroupCommand$ConsumerGroupService.getLogStartOffsets(ConsumerGroupCommand.scala:654) > at > kafka.admin.ConsumerGroupCommand$ConsumerGroupService.checkOffsetsRange(ConsumerGroupCommand.scala:888) > at > kafka.admin.ConsumerGroupCommand$ConsumerGroupService.prepareOffsetsToReset(ConsumerGroupCommand.scala:796) > at > kafka.admin.ConsumerGroupCommand$ConsumerGroupService.$anonfun$resetOffsets$1(ConsumerGroupCommand.scala:437) > at scala.collection.IterableOnceOps.foldLeft(IterableOnce.scala:646) > at scala.collection.IterableOnceOps.foldLeft$(IterableOnce.scala:642) > at scala.collection.AbstractIterable.foldLeft(Iterable.scala:919) > at > kafka.admin.ConsumerGroupCommand$ConsumerGroupService.resetOffsets(ConsumerGroupCommand.scala:432) > at kafka.admin.ConsumerGroupCommand$.run(ConsumerGroupCommand.scala:76) > at kafka.admin.ConsumerGroupCommand$.main(ConsumerGroupCommand.scala:59) > at kafka.admin.ConsumerGroupCommand.main(ConsumerGroupCommand.scala) > Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting > for a node assignment. Call: metadata > I think it should throw TopicNotExistException/NotAuthorizeException instead. > ``` > Thoughts: we should add topic permission / exist check in reset offset call > https://github.com/apache/kafka/blob/3.1/core/src/main/scala/kafka/admin/ConsumerGroupCommand.scala#L421 > > Let me know what do you think! > > Thanks, > Yujie -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (KAFKA-13667) add a constraint on listeners in combined mode
[ https://issues.apache.org/jira/browse/KAFKA-13667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming reassigned KAFKA-13667: -- Assignee: dengziming > add a constraint on listeners in combined mode > --- > > Key: KAFKA-13667 > URL: https://issues.apache.org/jira/browse/KAFKA-13667 > Project: Kafka > Issue Type: Improvement > Components: kraft >Affects Versions: 3.1.0 >Reporter: Luke Chen >Assignee: dengziming >Priority: Major > > While updating the description in the example properties file for KRaft mode, we > think we should add a constraint for "listeners" in combined mode. Both the > broker listener and the controller listener should be explicitly set, to avoid > confusion. > Please check this discussion: > [https://github.com/apache/kafka/pull/11616#discussion_r806399392] -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Comment Edited] (KAFKA-13667) add a constraint on listeners in combined mode
[ https://issues.apache.org/jira/browse/KAFKA-13667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17503248#comment-17503248 ] dengziming edited comment on KAFKA-13667 at 3/9/22, 2:35 AM: - We can't list only the controller listener for `listeners` in combined mode, because `advertised.listeners` must be a subset of `listeners` and the advertised listener cannot be empty, so we must always set both controller and broker listeners. was (Author: dengziming): we can't only list controller listener here in combined mode because `advertised.listeners` must be a subset of `listeners` and advertised listener can not be empty. so we must always set both controller and broker listeners here. > add a constraint on listeners in combined mode > --- > > Key: KAFKA-13667 > URL: https://issues.apache.org/jira/browse/KAFKA-13667 > Project: Kafka > Issue Type: Improvement > Components: kraft >Affects Versions: 3.1.0 >Reporter: Luke Chen >Priority: Major > > While updating the description in the example properties file for KRaft mode, we > think we should add a constraint for "listeners" in combined mode. Both the > broker listener and the controller listener should be explicitly set, to avoid > confusion. > Please check this discussion: > [https://github.com/apache/kafka/pull/11616#discussion_r806399392] -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (KAFKA-13667) add a constraint on listeners in combined mode
[ https://issues.apache.org/jira/browse/KAFKA-13667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17503248#comment-17503248 ] dengziming commented on KAFKA-13667: We can't list only the controller listener here in combined mode, because `advertised.listeners` must be a subset of `listeners` and the advertised listener cannot be empty, so we must always set both controller and broker listeners here. > add a constraint on listeners in combined mode > --- > > Key: KAFKA-13667 > URL: https://issues.apache.org/jira/browse/KAFKA-13667 > Project: Kafka > Issue Type: Improvement > Components: kraft >Affects Versions: 3.1.0 >Reporter: Luke Chen >Priority: Major > > While updating the description in the example properties file for KRaft mode, we > think we should add a constraint for "listeners" in combined mode. Both the > broker listener and the controller listener should be explicitly set, to avoid > confusion. > Please check this discussion: > [https://github.com/apache/kafka/pull/11616#discussion_r806399392] -- This message was sent by Atlassian Jira (v8.20.1#820001)
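To make the discussed constraint concrete, here is an illustrative combined-mode `server.properties` fragment; listener names, ports, and the single-voter quorum are example values:
{code}
# Combined (broker+controller) node: both listener types must be declared.
process.roles=broker,controller
node.id=1
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
controller.listener.names=CONTROLLER
controller.quorum.voters=1@localhost:9093
# advertised.listeners must be a non-empty subset of listeners, so the
# broker listener cannot be omitted from listeners above.
advertised.listeners=PLAINTEXT://localhost:9092
{code}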
[jira] [Commented] (KAFKA-13620) The request handler metric name for ControllerApis should be different than KafkaApis
[ https://issues.apache.org/jira/browse/KAFKA-13620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17492988#comment-17492988 ] dengziming commented on KAFKA-13620: Hi [~Kvicii], this refers to the metric name rather than the function name; you can see it in ControllerServer > The request handler metric name for ControllerApis should be different than > KafkaApis > - > > Key: KAFKA-13620 > URL: https://issues.apache.org/jira/browse/KAFKA-13620 > Project: Kafka > Issue Type: Bug >Reporter: Colin McCabe >Priority: Major > Labels: kip-500 > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (KAFKA-8872) Improvements to controller "deleting" state / topic Identifiers
[ https://issues.apache.org/jira/browse/KAFKA-8872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17490902#comment-17490902 ] dengziming commented on KAFKA-8872: --- I don't think this can be done in 3.2.0 since this KIP involves a lot of work and some of it has not been started yet. > Improvements to controller "deleting" state / topic Identifiers > > > Key: KAFKA-8872 > URL: https://issues.apache.org/jira/browse/KAFKA-8872 > Project: Kafka > Issue Type: Improvement >Reporter: Lucas Bradstreet >Assignee: Justine Olshan >Priority: Major > > Kafka currently uniquely identifies a topic by its name. This is generally > sufficient, but there are flaws in this scheme if a topic is deleted and > recreated with the same name. As a result, Kafka attempts to prevent these > classes of issues by ensuring a topic is deleted from all replicas before > completing a deletion. This solution is not perfect, as it is possible for > partitions to be reassigned from brokers while they are down, and there are > no guarantees that this state will ever be cleaned up and will not cause > issues in the future. > As the controller must wait for all replicas to delete their local > partitions, deletes can also become blocked, preventing topics from being > created with the same name until the deletion is complete on all replicas. > This can mean that downtime for a single broker can effectively cause a > complete outage for everyone producing/consuming to that topic name, as the > topic cannot be recreated without manual intervention. > Unique topic IDs could help address this issue by associating a unique ID > with each topic, ensuring a newly created topic with a previously used name > cannot be confused with a previous topic with that name. > > KIP-516: > https://cwiki.apache.org/confluence/display/KAFKA/KIP-516%3A+Topic+Identifiers > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (KAFKA-12251) Add topic ID support to StopReplica
[ https://issues.apache.org/jira/browse/KAFKA-12251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming resolved KAFKA-12251. Resolution: Won't Do > Add topic ID support to StopReplica > --- > > Key: KAFKA-12251 > URL: https://issues.apache.org/jira/browse/KAFKA-12251 > Project: Kafka > Issue Type: Sub-task >Reporter: dengziming >Assignee: dengziming >Priority: Major > > Remove the topic name and add the topic id to StopReplicaReq and StopReplicaResp -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (KAFKA-9837) New RPC for notifying controller of failed replica
[ https://issues.apache.org/jira/browse/KAFKA-9837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17487938#comment-17487938 ] dengziming commented on KAFKA-9837: --- Thank you [~cmccabe], your concern makes sense to me. It seems O(num_partitions_in_dir) RPCs are inevitable here, since the controller will write O(n) `PartitionChangeRecord`s to the KRaft metadata topic and send them to Raft observers (in ZK mode, the controller sends O(n) `UpdateMetadataRequest`s and `StopReplicaRequest`s). > New RPC for notifying controller of failed replica > -- > > Key: KAFKA-9837 > URL: https://issues.apache.org/jira/browse/KAFKA-9837 > Project: Kafka > Issue Type: New Feature > Components: controller, core >Reporter: David Arthur >Assignee: dengziming >Priority: Major > Labels: kip-500 > Fix For: 3.2.0 > > > This is the tracking ticket for > [KIP-589|https://cwiki.apache.org/confluence/display/KAFKA/KIP-589+Add+API+to+update+Replica+state+in+Controller]. > For the bridge release, brokers should no longer use ZooKeeper to notify the > controller that a log dir has failed. It should instead use an RPC mechanism. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (KAFKA-13637) Use default.api.timeout.ms config as default timeout for KafkaConsumer.endOffsets
dengziming created KAFKA-13637: -- Summary: Use default.api.timeout.ms config as default timeout for KafkaConsumer.endOffsets Key: KAFKA-13637 URL: https://issues.apache.org/jira/browse/KAFKA-13637 Project: Kafka Issue Type: Improvement Reporter: dengziming Assignee: dengziming In KafkaConsumer, we use `request.timeout.ms` in `endOffsets` and `default.api.timeout.ms` in `beginningOffsets`; we should use `default.api.timeout.ms` for both. -- This message was sent by Atlassian Jira (v8.20.1#820001)
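Until the defaults are aligned, a caller can sidestep the inconsistency by passing an explicit timeout to both calls; a minimal sketch (broker address and topic are placeholders):
{code:java}
import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class EndOffsetsExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class);
        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            List<TopicPartition> tps = List.of(new TopicPartition("topic1", 0));
            // The explicit Duration overloads behave identically for both calls,
            // regardless of which config supplies the default timeout.
            Map<TopicPartition, Long> begin = consumer.beginningOffsets(tps, Duration.ofSeconds(30));
            Map<TopicPartition, Long> end = consumer.endOffsets(tps, Duration.ofSeconds(30));
            System.out.println(begin + " " + end);
        }
    }
}
{code}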
[jira] [Commented] (KAFKA-13609) Fix the exception type thrown from dynamic broker config validation
[ https://issues.apache.org/jira/browse/KAFKA-13609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17480320#comment-17480320 ] dengziming commented on KAFKA-13609: Hi [~cmccabe], I am already working on this; I found that we throw different exceptions in KRaft and ZK mode, so I would like to pick this up. > Fix the exception type thrown from dynamic broker config validation > --- > > Key: KAFKA-13609 > URL: https://issues.apache.org/jira/browse/KAFKA-13609 > Project: Kafka > Issue Type: Bug >Reporter: Colin McCabe >Assignee: dengziming >Priority: Major > > Currently, we throw InvalidRequestException if the broker configuration fails > validation. However, it would be more appropriate to throw > InvalidConfigurationException in this case. The configuration is still > well-formed, but just can't be applied. > This change might require a KIP since it has compatibility implications. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (KAFKA-13609) Fix the exception type thrown from dynamic broker config validation
[ https://issues.apache.org/jira/browse/KAFKA-13609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming reassigned KAFKA-13609: -- Assignee: dengziming > Fix the exception type thrown from dynamic broker config validation > --- > > Key: KAFKA-13609 > URL: https://issues.apache.org/jira/browse/KAFKA-13609 > Project: Kafka > Issue Type: Bug >Reporter: Colin McCabe >Assignee: dengziming >Priority: Major > > Currently, we throw InvalidRequestException if the broker configuration fails > validation. However, it would be more appropriate to throw > InvalidConfigurationException in this case. The configuration is still > well-formed, but just can't be applied. > This change might require a KIP since it has compatibility implications. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (KAFKA-13242) KRaft Controller doesn't handle UpdateFeaturesRequest
[ https://issues.apache.org/jira/browse/KAFKA-13242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming updated KAFKA-13242: --- Parent: KAFKA-13410 Issue Type: Sub-task (was: Bug) > KRaft Controller doesn't handle UpdateFeaturesRequest > - > > Key: KAFKA-13242 > URL: https://issues.apache.org/jira/browse/KAFKA-13242 > Project: Kafka > Issue Type: Sub-task >Reporter: dengziming >Assignee: dengziming >Priority: Major > Fix For: 3.0.1 > > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (KAFKA-13528) KRaft RegisterBroker should validate that the cluster ID matches
[ https://issues.apache.org/jira/browse/KAFKA-13528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming resolved KAFKA-13528. Resolution: Fixed > KRaft RegisterBroker should validate that the cluster ID matches > > > Key: KAFKA-13528 > URL: https://issues.apache.org/jira/browse/KAFKA-13528 > Project: Kafka > Issue Type: Bug >Reporter: Colin McCabe >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (KAFKA-13503) Validate broker configs for KRaft
[ https://issues.apache.org/jira/browse/KAFKA-13503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming reassigned KAFKA-13503: -- Assignee: dengziming > Validate broker configs for KRaft > - > > Key: KAFKA-13503 > URL: https://issues.apache.org/jira/browse/KAFKA-13503 > Project: Kafka > Issue Type: Improvement >Reporter: Colin McCabe >Assignee: dengziming >Priority: Major > Labels: kip-500 > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (KAFKA-12616) Convert integration tests to use ClusterTest
[ https://issues.apache.org/jira/browse/KAFKA-12616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17470362#comment-17470362 ] dengziming commented on KAFKA-12616: It seems we can use QuorumTestHarness to test both ZK and KRaft, so can we remove ClusterTest? > Convert integration tests to use ClusterTest > - > > Key: KAFKA-12616 > URL: https://issues.apache.org/jira/browse/KAFKA-12616 > Project: Kafka > Issue Type: Improvement >Reporter: Jason Gustafson >Priority: Major > Labels: kip-500 > Fix For: 3.2.0 > > > We would like to convert integration tests to use the new ClusterTest > annotations so that we can easily test both the Zk and KRaft implementations. > This will require adding a bunch of support to the ClusterTest framework as > we go along. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (KAFKA-13570) Fallback of unsupported versions of ApiVersionsRequest and RequestHeader?
[ https://issues.apache.org/jira/browse/KAFKA-13570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17468965#comment-17468965 ] dengziming commented on KAFKA-13570: I don't think it's a simple problem for a server to properly handle a higher-version RPC from a client, so it's the client's responsibility to handle this. Most Kafka clients will fall back to an older RPC version on an UnsupportedVersionException, but that is not guaranteed. > Fallback of unsupported versions of ApiVersionsRequest and RequestHeader? > - > > Key: KAFKA-13570 > URL: https://issues.apache.org/jira/browse/KAFKA-13570 > Project: Kafka > Issue Type: Improvement >Reporter: Fredrik Arvidsson >Priority: Minor > > I've gone through the protocol documentation and the source code, but I can't > find any explicit documentation stating how clients and brokers are handling > the scenario when the client sends higher versions of ApiVersionsRequest and > RequestHeader which the broker doesn't understand. I've seen hints in > discussions that the broker would fallback to returning version 0 of the > ApiVersionsResponse in this scenario. > Are there any documentation that explains these scenarios? -- This message was sent by Atlassian Jira (v8.20.1#820001)
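The fallback behavior described in the comment can be sketched generically as below; the `Sender` interface and the version bounds are hypothetical stand-ins, not the Kafka client's internals:
{code:java}
import org.apache.kafka.common.errors.UnsupportedVersionException;

final class VersionFallback {
    interface Sender<R> { R send(short version); }

    // Try from the newest version we support down to the oldest; rethrow
    // only when there is nothing older to fall back to.
    static <R> R sendWithFallback(Sender<R> sender, short max, short min) {
        for (short v = max; v >= min; v--) {
            try {
                return sender.send(v);
            } catch (UnsupportedVersionException e) {
                if (v == min) throw e;
            }
        }
        throw new IllegalStateException("unreachable");
    }
}
{code}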
[jira] [Commented] (KAFKA-13569) Is changing a message / field name a breaking change?
[ https://issues.apache.org/jira/browse/KAFKA-13569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17466288#comment-17466288 ] dengziming commented on KAFKA-13569: Currently the JSON schema is indeed not a public API; a KIP would be required before we make it one, so we'll try to keep the names unchanged but can't guarantee that. > Is changing a message / field name a breaking change? > - > > Key: KAFKA-13569 > URL: https://issues.apache.org/jira/browse/KAFKA-13569 > Project: Kafka > Issue Type: Improvement >Reporter: Fredrik Arvidsson >Priority: Minor > > This > [commit|https://github.com/ijuma/kafka/commit/89ee566e466cd58023c6a4cac64893bfcf38b570#diff-fac7080d67da905a80126d58fc1745c9a1409de7ef7d093c2ac66a888b134633] > changed the name of ListOffsetRequest and ListOffsetResponse to > ListOffsetsRequest and ListOffsetsResponse. It was merged from this > [PR|https://github.com/apache/kafka/pull/9748] stating it is a MINOR change. > Are changes to message names (and perhaps field names as well) not considered > a breaking change for the protocol schemas? > I do realize it doesn't change how it is sent over the wire, but it does > affect code generators. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (KAFKA-13552) Unable to dynamically change broker log levels on KRaft
[ https://issues.apache.org/jira/browse/KAFKA-13552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17461137#comment-17461137 ] dengziming commented on KAFKA-13552: Hello [~rndgstn], it seems this is a duplicate of KAFKA-13502. > Unable to dynamically change broker log levels on KRaft > --- > > Key: KAFKA-13552 > URL: https://issues.apache.org/jira/browse/KAFKA-13552 > Project: Kafka > Issue Type: Bug > Components: kraft >Affects Versions: 3.1.0, 3.0.0 >Reporter: Ron Dagostino >Priority: Major > > It is currently not possible to dynamically change the log level in KRaft. > For example: > kafka-configs.sh --bootstrap-server --alter --add-config > "kafka.server.ReplicaManager=DEBUG" --entity-type broker-loggers > --entity-name 0 > Results in: > org.apache.kafka.common.errors.InvalidRequestException: Unexpected resource > type BROKER_LOGGER. > The code to process this request is in ZkAdminManager.alterLogLevelConfigs(). > This needs to be moved out of there, and the functionality has to be > processed locally on the broker instead of being forwarded to the KRaft > controller. > It is also an open question as to how we can dynamically alter log levels for > a remote KRaft controller. Connecting directly to it is one possible > solution, but that may not be desirable since generally connecting directly > to the controller is not necessary. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (KAFKA-13509) Support max timestamp in GetOffsetShell
dengziming created KAFKA-13509: -- Summary: Support max timestamp in GetOffsetShell Key: KAFKA-13509 URL: https://issues.apache.org/jira/browse/KAFKA-13509 Project: Kafka Issue Type: Sub-task Components: tools Reporter: dengziming Assignee: dengziming We would like to list the offset with the max timestamp using `kafka.tools.GetOffsetShell`: {code} bin/kafka-run-class.sh kafka.tools.GetOffsetShell --bootstrap-server localhost:9092 --topic topic1 --time -3 {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
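For comparison, the same lookup is available programmatically through the admin client's `OffsetSpec.maxTimestamp()` (the ListOffsets-level support this sub-task builds on); topic and broker address are placeholders:
{code:java}
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ListOffsetsResult;
import org.apache.kafka.clients.admin.OffsetSpec;
import org.apache.kafka.common.TopicPartition;

public class MaxTimestampOffsetExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            TopicPartition tp = new TopicPartition("topic1", 0);
            ListOffsetsResult.ListOffsetsResultInfo info =
                admin.listOffsets(Map.of(tp, OffsetSpec.maxTimestamp()))
                     .partitionResult(tp).get();
            // Offset of the record with the largest timestamp in the partition.
            System.out.println(info.offset() + " @ " + info.timestamp());
        }
    }
}
{code}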
[jira] [Commented] (KAFKA-13467) Clients never refresh cached bootstrap IPs
[ https://issues.apache.org/jira/browse/KAFKA-13467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17446746#comment-17446746 ] dengziming commented on KAFKA-13467: Since clients have no visibility into changes in bootstrap IPs, it seems that the only mechanism we have for updating the cached IPs is connection establishment. By closing the broker side of the connection, clients would be forced to reconnect and receive an updated set of bootstrap IPs. What do you think [~mjsax]? Should we investigate an alternative approach in a new KIP? > Clients never refresh cached bootstrap IPs > -- > > Key: KAFKA-13467 > URL: https://issues.apache.org/jira/browse/KAFKA-13467 > Project: Kafka > Issue Type: Improvement > Components: clients, network >Reporter: Matthias J. Sax >Priority: Minor > > Follow up ticket to https://issues.apache.org/jira/browse/KAFKA-13405. > For certain broker rolling upgrade scenarios, it would be beneficial to > expire cached bootstrap server IP addresses and re-resolve those IPs to > allow clients to re-connect to the cluster without the need to restart the > client. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (KAFKA-13462) KRaft server does not return internal topics on list topics RPC
[ https://issues.apache.org/jira/browse/KAFKA-13462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming resolved KAFKA-13462. Resolution: Invalid __consumer_offsets will not be created unless we store committed offsets > KRaft server does not return internal topics on list topics RPC > --- > > Key: KAFKA-13462 > URL: https://issues.apache.org/jira/browse/KAFKA-13462 > Project: Kafka > Issue Type: Bug >Reporter: dengziming >Assignee: dengziming >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (KAFKA-13462) KRaft server does not return internal topics on list topics RPC
[ https://issues.apache.org/jira/browse/KAFKA-13462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming reassigned KAFKA-13462: -- Assignee: dengziming > KRaft server does not return internal topics on list topics RPC > --- > > Key: KAFKA-13462 > URL: https://issues.apache.org/jira/browse/KAFKA-13462 > Project: Kafka > Issue Type: Bug >Reporter: dengziming >Assignee: dengziming >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (KAFKA-13462) KRaft server does not return internal topics on list topics RPC
dengziming created KAFKA-13462: -- Summary: KRaft server does not return internal topics on list topics RPC Key: KAFKA-13462 URL: https://issues.apache.org/jira/browse/KAFKA-13462 Project: Kafka Issue Type: Bug Reporter: dengziming -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (KAFKA-13242) KRaft Controller doesn't handle UpdateFeaturesRequest
[ https://issues.apache.org/jira/browse/KAFKA-13242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming reassigned KAFKA-13242: -- Assignee: dengziming > KRaft Controller doesn't handle UpdateFeaturesRequest > - > > Key: KAFKA-13242 > URL: https://issues.apache.org/jira/browse/KAFKA-13242 > Project: Kafka > Issue Type: Bug >Reporter: dengziming >Assignee: dengziming >Priority: Major > Fix For: 3.0.1 > > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (KAFKA-13316) Convert CreateTopicsRequestWithPolicyTest to use ClusterTest
dengziming created KAFKA-13316: -- Summary: Convert CreateTopicsRequestWithPolicyTest to use ClusterTest Key: KAFKA-13316 URL: https://issues.apache.org/jira/browse/KAFKA-13316 Project: Kafka Issue Type: Sub-task Reporter: dengziming Follow-up for KAFKA-13279 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-13242) KRaft Controller doesn't handle UpdateFeaturesRequest
[ https://issues.apache.org/jira/browse/KAFKA-13242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming updated KAFKA-13242: --- Fix Version/s: 3.0.1 > KRaft Controller doesn't handle UpdateFeaturesRequest > - > > Key: KAFKA-13242 > URL: https://issues.apache.org/jira/browse/KAFKA-13242 > Project: Kafka > Issue Type: Bug >Reporter: dengziming >Priority: Major > Fix For: 3.0.1 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-13242) KRaft Controller doesn't handle UpdateFeaturesRequest
dengziming created KAFKA-13242: -- Summary: KRaft Controller doesn't handle UpdateFeaturesRequest Key: KAFKA-13242 URL: https://issues.apache.org/jira/browse/KAFKA-13242 Project: Kafka Issue Type: Bug Reporter: dengziming -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-13228) ApiVersionRequests are not correctly handled in kraft mode
dengziming created KAFKA-13228: -- Summary: ApiVersionRequests are not correctly handled in kraft mode Key: KAFKA-13228 URL: https://issues.apache.org/jira/browse/KAFKA-13228 Project: Kafka Issue Type: Bug Reporter: dengziming Assignee: dengziming Fix For: 3.0.1 I'm trying to describe the quorum in kraft mode but got `org.apache.kafka.common.errors.UnsupportedVersionException: The broker does not support DESCRIBE_QUORUM`. This happens because we only consider `ApiKeys.zkBrokerApis()` when we call `NodeApiVersions.create()` -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KAFKA-12701) NPE in MetadataRequest when using topic IDs
[ https://issues.apache.org/jira/browse/KAFKA-12701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming resolved KAFKA-12701. Assignee: Justine Olshan (was: dengziming) Resolution: Fixed > NPE in MetadataRequest when using topic IDs > --- > > Key: KAFKA-12701 > URL: https://issues.apache.org/jira/browse/KAFKA-12701 > Project: Kafka > Issue Type: Bug >Affects Versions: 2.8.0 >Reporter: Travis Bischel >Assignee: Justine Olshan >Priority: Major > > Authorized result checking relies on topic name to not be null, which, when > using topic IDs, it is. > Unlike the logic in handleDeleteTopicsRequest, handleMetadataRequest does not > check zk for the names corresponding to topic IDs if topic IDs are present. > {noformat} > [2021-04-21 05:53:01,463] ERROR [KafkaApi-1] Error when handling request: > clientId=kgo, correlationId=1, api=METADATA, version=11, > body=MetadataRequestData(topics=[MetadataRequestTopic(topicId=LmqOoFOASnqQp_4-oJgeKA, > name=null)], allowAutoTopicCreation=false, > includeClusterAuthorizedOperations=false, > includeTopicAuthorizedOperations=false) (kafka.server.RequestHandlerHelper) > java.lang.NullPointerException: name > at java.base/java.util.Objects.requireNonNull(Unknown Source) > at > org.apache.kafka.common.resource.ResourcePattern.(ResourcePattern.java:50) > at > kafka.server.AuthHelper.$anonfun$filterByAuthorized$3(AuthHelper.scala:121) > at scala.collection.Iterator$$anon$9.next(Iterator.scala:575) > at scala.collection.mutable.Growable.addAll(Growable.scala:62) > at scala.collection.mutable.Growable.addAll$(Growable.scala:57) > at scala.collection.mutable.ArrayBuffer.addAll(ArrayBuffer.scala:142) > at scala.collection.mutable.ArrayBuffer.addAll(ArrayBuffer.scala:42) > at scala.collection.mutable.ArrayBuffer$.from(ArrayBuffer.scala:258) > at scala.collection.mutable.ArrayBuffer$.from(ArrayBuffer.scala:247) > at scala.collection.SeqFactory$Delegate.from(Factory.scala:306) > at scala.collection.IterableOnceOps.toBuffer(IterableOnce.scala:1270) > at scala.collection.IterableOnceOps.toBuffer$(IterableOnce.scala:1270) > at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1288) > at kafka.server.AuthHelper.filterByAuthorized(AuthHelper.scala:120) > at > kafka.server.KafkaApis.handleTopicMetadataRequest(KafkaApis.scala:1146) > at kafka.server.KafkaApis.handle(KafkaApis.scala:170) > at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:74) > at java.base/java.lang.Thread.run(Unknown Source) > [2021-04-21 05:53:01,464] ERROR [Kafka Request Handler 1 on Broker 1], > Exception when handling request (kafka.server.KafkaRequestHandler) > java.lang.NullPointerException > at > org.apache.kafka.common.message.MetadataResponseData$MetadataResponseTopic.addSize(MetadataResponseData.java:1247) > at > org.apache.kafka.common.message.MetadataResponseData.addSize(MetadataResponseData.java:417) > at > org.apache.kafka.common.protocol.SendBuilder.buildSend(SendBuilder.java:218) > at > org.apache.kafka.common.protocol.SendBuilder.buildResponseSend(SendBuilder.java:200) > at > org.apache.kafka.common.requests.AbstractResponse.toSend(AbstractResponse.java:43) > at > org.apache.kafka.common.requests.RequestContext.buildResponseSend(RequestContext.java:111) > at > kafka.network.RequestChannel$Request.buildResponseSend(RequestChannel.scala:132) > at > kafka.server.RequestHandlerHelper.sendResponse(RequestHandlerHelper.scala:185) > at > kafka.server.RequestHandlerHelper.sendErrorOrCloseConnection(RequestHandlerHelper.scala:155) > at > 
kafka.server.RequestHandlerHelper.sendErrorResponseMaybeThrottle(RequestHandlerHelper.scala:109) > at > kafka.server.RequestHandlerHelper.handleError(RequestHandlerHelper.scala:79) > at kafka.server.KafkaApis.handle(KafkaApis.scala:229) > at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:74) > at java.base/java.lang.Thread.run(Unknown Source) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (KAFKA-12908) Load snapshot heuristic
[ https://issues.apache.org/jira/browse/KAFKA-12908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398747#comment-17398747 ] dengziming edited comment on KAFKA-12908 at 8/13/21, 3:46 PM: -- It seems we can't write the number of records in a snapshot to the SnapshotHeader; instead, we can write the number of records to the SnapshotFooter, so we should read the SnapshotFooter to decide whether to load a snapshot or read from the next offset. WDYT [~jagsancio]? was (Author: dengziming): It seems we can't write the number of records in a snapshot to the SnapshotHeader, instead, we can write num of records to SnapshotFooter, so we should read SnapshotFooter to decide whether to load a snapshot or read from the next offset. WDYT [~jagsancio] > Load snapshot heuristic > --- > > Key: KAFKA-12908 > URL: https://issues.apache.org/jira/browse/KAFKA-12908 > Project: Kafka > Issue Type: Sub-task >Reporter: Jose Armando Garcia Sancio >Assignee: dengziming >Priority: Minor > > The {{KafkaRaftClient}} implementation forces the {{RaftClient.Listener}} > to load a snapshot only when the listener's next offset is less than the > start offset. > This is technically correct but in some cases it may be more efficient to > load a snapshot even when the next offset exists in the log. This is clearly > true when the latest snapshot has fewer entries than the number of records > from the next offset to the latest snapshot. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-12908) Load snapshot heuristic
[ https://issues.apache.org/jira/browse/KAFKA-12908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398747#comment-17398747 ] dengziming commented on KAFKA-12908: It seems we can't write the number of records in a snapshot to the SnapshotHeader; instead, we can write the number of records to the SnapshotFooter, so we should read the SnapshotFooter to decide whether to load a snapshot or read from the next offset. WDYT [~jagsancio]? > Load snapshot heuristic > --- > > Key: KAFKA-12908 > URL: https://issues.apache.org/jira/browse/KAFKA-12908 > Project: Kafka > Issue Type: Sub-task >Reporter: Jose Armando Garcia Sancio >Assignee: dengziming >Priority: Minor > > The {{KafkaRaftClient}} implementation forces the {{RaftClient.Listener}} > to load a snapshot only when the listener's next offset is less than the > start offset. > This is technically correct but in some cases it may be more efficient to > load a snapshot even when the next offset exists in the log. This is clearly > true when the latest snapshot has fewer entries than the number of records > from the next offset to the latest snapshot. -- This message was sent by Atlassian Jira (v8.3.4#803005)
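The heuristic from the issue description could be expressed roughly as below; `snapshotEntryCount` would come from wherever the record count ends up being stored (header or footer, per the comment above), so all names here are assumptions:
{code:java}
final class SnapshotLoadHeuristic {
    // Load the snapshot when replaying it would touch fewer records than
    // replaying the log from the listener's next offset up to the snapshot.
    static boolean shouldLoadSnapshot(long logStartOffset,
                                      long nextExpectedOffset,
                                      long snapshotEndOffset,
                                      long snapshotEntryCount) {
        if (nextExpectedOffset < logStartOffset) {
            return true; // mandatory case: those offsets are gone from the log
        }
        long recordsToReplay = snapshotEndOffset - nextExpectedOffset;
        return snapshotEntryCount < recordsToReplay; // efficiency heuristic
    }
}
{code}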
[jira] [Resolved] (KAFKA-12155) Delay increasing the log start offset
[ https://issues.apache.org/jira/browse/KAFKA-12155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming resolved KAFKA-12155. Resolution: Fixed > Delay increasing the log start offset > - > > Key: KAFKA-12155 > URL: https://issues.apache.org/jira/browse/KAFKA-12155 > Project: Kafka > Issue Type: Sub-task > Components: replication >Reporter: Jose Armando Garcia Sancio >Assignee: David Arthur >Priority: Major > > The implementation in [https://github.com/apache/kafka/pull/9816] increases > the log start offset as soon as a snapshot is created that is greater than > the log start offset. This is correct but causes some inefficiency in some > cases. > # Any follower, voters or observers, with an end offset between the leader's > log start offset and the leader's latest snapshot will get invalidated. This > will cause those followers to fetch the new snapshot and reload their state > machine. > # Any {{Listener}} or state machine that has a {{nextExpectedOffset()}} less > than the latest snapshot will get invalidated. This will cause the state > machine to have to reload its state from the latest snapshot. > To minimize the frequency of these reloads KIP-630 proposes adding the > following configuration: > * {{metadata.start.offset.lag.time.max.ms}} - The maximum amount of time > that the leader will wait for an offset to get replicated to all of the live > replicas before advancing the {{LogStartOffset}}. See section “When to > Increase the LogStartOffset”. The default is 7 days. > This description and implementation should be extended to also apply to the > state machine, or {{Listener}}. The local log start offset should be > increased when all of the {{ListenerContext}}'s {{nextExpectedOffset()}} is > greater than the offset of the latest snapshot. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KAFKA-12465) Decide whether inconsistent cluster id errors are fatal
[ https://issues.apache.org/jira/browse/KAFKA-12465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming reassigned KAFKA-12465: -- Assignee: dengziming > Decide whether inconsistent cluster id errors are fatal > -- > > Key: KAFKA-12465 > URL: https://issues.apache.org/jira/browse/KAFKA-12465 > Project: Kafka > Issue Type: Sub-task >Reporter: dengziming >Assignee: dengziming >Priority: Major > > Currently, we just log an error when an inconsistent cluster id occurs. We > should set a window during startup when these errors are fatal; after that > window, we no longer treat them as fatal. See > https://github.com/apache/kafka/pull/10289#discussion_r592853088 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-13148) Kraft Controller doesn't handle scheduleAppend returning Long.MAX_VALUE
[ https://issues.apache.org/jira/browse/KAFKA-13148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389282#comment-17389282 ] dengziming commented on KAFKA-13148: KAFKA-12158 will change the return value of `scheduleAppend` and throw an exception if the append fails, letting the `QuorumController` handle it, so we no longer need to handle null and Long.MAX_VALUE since the `QuorumController` will resign leadership on any error. > Kraft Controller doesn't handle scheduleAppend returning Long.MAX_VALUE > --- > > Key: KAFKA-13148 > URL: https://issues.apache.org/jira/browse/KAFKA-13148 > Project: Kafka > Issue Type: Bug > Components: controller, kraft >Reporter: Jose Armando Garcia Sancio >Assignee: Niket Goel >Priority: Major > Labels: kip-500 > > In some cases the RaftClient will return Long.MAX_VALUE: > {code:java} > /** >* Append a list of records to the log. The write will be scheduled for > some time >* in the future. There is no guarantee that appended records will be > written to >* the log and eventually committed. However, it is guaranteed that if > any of the >* records become committed, then all of them will be. >* >* If the provided current leader epoch does not match the current > epoch, which >* is possible when the state machine has yet to observe the epoch > change, then >* this method will return {@link Long#MAX_VALUE} to indicate an offset > which is >* not possible to become committed. The state machine is expected to > discard all >* uncommitted entries after observing an epoch change. >* >* @param epoch the current leader epoch >* @param records the list of records to append >* @return the expected offset of the last record; {@link > Long#MAX_VALUE} if the records could >* not become committed; null if no memory could be allocated for the > batch at this time >* @throws org.apache.kafka.common.errors.RecordBatchTooLargeException > if the size of the records is greater than the maximum >* batch size; if this exception is thrown none of the elements > in records were >* committed >*/ > Long scheduleAtomicAppend(int epoch, List records); > {code} > The controller doesn't handle this case: > {code:java} > // If the operation returned a batch of records, those > records need to be > // written before we can return our result to the user. > Here, we hand off > // the batch of records to the raft client. They will be > written out > // asynchronously. > final long offset; > if (result.isAtomic()) { > offset = > raftClient.scheduleAtomicAppend(controllerEpoch, result.records()); > } else { > offset = raftClient.scheduleAppend(controllerEpoch, > result.records()); > } > op.processBatchEndOffset(offset); > writeOffset = offset; > resultAndOffset = ControllerResultAndOffset.of(offset, > result); > for (ApiMessageAndVersion message : result.records()) { > replay(message.message(), Optional.empty(), offset); > } > snapshotRegistry.getOrCreateSnapshot(offset); > log.debug("Read-write operation {} will be completed when > the log " + > "reaches offset {}.", this, resultAndOffset.offset()); > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
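Concretely, the missing handling amounts to branching on the two sentinel results before recording the write offset; a hedged sketch of the case analysis, with `resign` and `retryLater` standing in for whatever the controller actually does:
{code:java}
// Sketch of the case analysis implied by the javadoc quoted above;
// illustrative only, not the QuorumController's actual code.
final class AppendResultHandling {
    static void handle(Long offset, Runnable resign, Runnable retryLater) {
        if (offset == null) {
            retryLater.run();   // no memory for the batch: backpressure, try again later
        } else if (offset == Long.MAX_VALUE) {
            resign.run();       // stale epoch: these records can never become committed
        } else {
            // success: completion is expected when the log reaches `offset`
        }
    }
}
{code}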
[jira] [Commented] (KAFKA-13042) Flaky test KafkaMetadataLogTest.testDeleteSnapshots()
[ https://issues.apache.org/jira/browse/KAFKA-13042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17378574#comment-17378574 ] dengziming commented on KAFKA-13042: [~meher_crok] This is a bug in Scala 2.12 and is fixed here: https://github.com/apache/kafka/pull/10997 > Flaky test KafkaMetadataLogTest.testDeleteSnapshots() > - > > Key: KAFKA-13042 > URL: https://issues.apache.org/jira/browse/KAFKA-13042 > Project: Kafka > Issue Type: Bug >Reporter: dengziming >Priority: Major > > This test fails many times but succeeds locally. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KAFKA-13042) Flaky test KafkaMetadataLogTest.testDeleteSnapshots()
[ https://issues.apache.org/jira/browse/KAFKA-13042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming resolved KAFKA-13042. Resolution: Fixed > Flaky test KafkaMetadataLogTest.testDeleteSnapshots() > - > > Key: KAFKA-13042 > URL: https://issues.apache.org/jira/browse/KAFKA-13042 > Project: Kafka > Issue Type: Bug >Reporter: dengziming >Priority: Major > > This test fails many times but succeeds locally. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-13042) Flaky test KafkaMetadataLogTest.testDeleteSnapshots()
dengziming created KAFKA-13042: -- Summary: Flaky test KafkaMetadataLogTest.testDeleteSnapshots() Key: KAFKA-13042 URL: https://issues.apache.org/jira/browse/KAFKA-13042 Project: Kafka Issue Type: Bug Reporter: dengziming This test fails many times but succeeds locally. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-12660) Do not update offset commit sensor after append failure
[ https://issues.apache.org/jira/browse/KAFKA-12660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17371766#comment-17371766 ] dengziming commented on KAFKA-12660: [~kirktrue] Thank you for your attention, but I have already opened a PR for this issue: [https://github.com/apache/kafka/pull/10560]; are you interested in reviewing it? > Do not update offset commit sensor after append failure > --- > > Key: KAFKA-12660 > URL: https://issues.apache.org/jira/browse/KAFKA-12660 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: dengziming >Priority: Major > > In the append callback after writing an offset to the log in > `GroupMetadataManager`, it seems wrong to update the offset commit sensor > prior to checking for errors: > https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/coordinator/group/GroupMetadataManager.scala#L394. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
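The fix the description points at is essentially to move the metric update behind the error check; a small sketch with illustrative names, not the GroupMetadataManager's actual code:
{code:java}
import org.apache.kafka.common.metrics.Sensor;
import org.apache.kafka.common.protocol.Errors;

final class OffsetCommitMetrics {
    // Illustrative fix: record the offset-commit metric only when the log
    // append succeeded, instead of unconditionally before the error check.
    static void maybeRecord(Sensor offsetCommitsSensor, Errors appendError, int numOffsets) {
        if (appendError == Errors.NONE) {
            offsetCommitsSensor.record(numOffsets);
        }
    }
}
{code}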
[jira] [Commented] (KAFKA-13005) Support JBOD in kraft mode
[ https://issues.apache.org/jira/browse/KAFKA-13005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17371005#comment-17371005 ] dengziming commented on KAFKA-13005: This is currently blocked by KAFKA-9837; I will do this ;) > Support JBOD in kraft mode > -- > > Key: KAFKA-13005 > URL: https://issues.apache.org/jira/browse/KAFKA-13005 > Project: Kafka > Issue Type: Improvement >Reporter: Colin McCabe >Assignee: dengziming >Priority: Major > Labels: kip-500 > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KAFKA-13005) Support JBOD in kraft mode
[ https://issues.apache.org/jira/browse/KAFKA-13005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming reassigned KAFKA-13005: -- Assignee: dengziming > Support JBOD in kraft mode > -- > > Key: KAFKA-13005 > URL: https://issues.apache.org/jira/browse/KAFKA-13005 > Project: Kafka > Issue Type: Improvement >Reporter: Colin McCabe >Assignee: dengziming >Priority: Major > Labels: kip-500 > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KAFKA-12158) Consider better return type of RaftClient.scheduleAppend
[ https://issues.apache.org/jira/browse/KAFKA-12158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dengziming reassigned KAFKA-12158: -- Assignee: dengziming > Consider better return type of RaftClient.scheduleAppend > > > Key: KAFKA-12158 > URL: https://issues.apache.org/jira/browse/KAFKA-12158 > Project: Kafka > Issue Type: Sub-task >Reporter: Jason Gustafson >Assignee: dengziming >Priority: Major > > Currently `RaftClient` has the following append API: > {code} > Long scheduleAppend(int epoch, List records); > {code} > There are a few possible cases that the single return value is trying to > handle: > 1. The epoch doesn't match or we are not the current leader => return > Long.MaxValue > 2. We failed to allocate memory to write the batch (backpressure case) => > return null > 3. We successfully scheduled the append => return the expected offset > It might be better to define a richer type so that the cases that must be > handled are clearer. At a minimum, it would be better to return > `OptionalLong` and get rid of the null case. -- This message was sent by Atlassian Jira (v8.3.4#803005)
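One shape such a richer type could take, purely as a sketch of the design space (per the KAFKA-13148 comment above, the direction eventually taken was to throw exceptions instead):
{code:java}
import java.util.OptionalLong;

// Illustrative result type making the three outcomes of scheduleAppend
// explicit, instead of overloading a nullable Long and a sentinel value.
final class AppendResult {
    enum Kind { SCHEDULED, STALE_EPOCH, OUT_OF_MEMORY }

    private final Kind kind;
    private final long lastOffset; // meaningful only when kind == SCHEDULED

    private AppendResult(Kind kind, long lastOffset) {
        this.kind = kind;
        this.lastOffset = lastOffset;
    }

    static AppendResult scheduled(long lastOffset) { return new AppendResult(Kind.SCHEDULED, lastOffset); }
    static AppendResult staleEpoch() { return new AppendResult(Kind.STALE_EPOCH, -1L); }
    static AppendResult outOfMemory() { return new AppendResult(Kind.OUT_OF_MEMORY, -1L); }

    Kind kind() { return kind; }

    OptionalLong lastOffset() {
        return kind == Kind.SCHEDULED ? OptionalLong.of(lastOffset) : OptionalLong.empty();
    }
}
{code}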
[jira] [Comment Edited] (KAFKA-12908) Load snapshot heuristic
[ https://issues.apache.org/jira/browse/KAFKA-12908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17366100#comment-17366100 ] dengziming edited comment on KAFKA-12908 at 6/20/21, 4:30 AM: -- This may depend on KAFKA-12155 was (Author: dengziming): This may be depend on KAFKA-12155 > Load snapshot heuristic > --- > > Key: KAFKA-12908 > URL: https://issues.apache.org/jira/browse/KAFKA-12908 > Project: Kafka > Issue Type: Sub-task >Reporter: Jose Armando Garcia Sancio >Assignee: dengziming >Priority: Minor > > The {{KafkaRaftCient}} implementation only forces the {{RaftClient.Listener}} > to load a snapshot only when the listener's next offset is less than the > start offset. > This is technically correct but in some cases it may be more efficient to > load a snapshot even when the next offset exists in the log. This is clearly > true when the latest snapshot has less entries than the number of records > from the next offset to the latest snapshot. -- This message was sent by Atlassian Jira (v8.3.4#803005)