[jira] [Created] (KAFKA-8164) Improve test passing rate by rerunning flaky tests
Viktor Somogyi-Vass created KAFKA-8164: -- Summary: Improve test passing rate by rerunning flaky tests Key: KAFKA-8164 URL: https://issues.apache.org/jira/browse/KAFKA-8164 Project: Kafka Issue Type: Improvement Affects Versions: 2.3.0 Reporter: Viktor Somogyi-Vass Assignee: Viktor Somogyi-Vass Flaky test failures are a recurring problem: * pull requests need to be rerun to achieve a green build * rerunning the whole build for a PR is resource and time consuming * a green build gives high confidence when releasing (and in general, too), but flaky tests often erode this and annoy developers The aim of this JIRA is to provide an improvement which automatically retries any test that fails on its first run. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
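The "retry on first failure" policy described in the ticket can be sketched with a minimal, hypothetical helper. This is illustrative only: the actual change targets the build's test runner, and the `Retry` class and its signature are assumptions made for the sketch.

```java
// Hypothetical sketch of the "retry a failed test" idea from KAFKA-8164;
// the real change lives in the build's test infrastructure, not in a helper.
class Retry {
    /** Runs body, retrying until it passes or maxAttempts is exhausted. */
    static void retry(int maxAttempts, Runnable body) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                body.run();
                return; // this attempt passed; a flaky failure is masked
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last; // genuinely broken: every attempt failed
    }
}
```

A genuinely failing test still fails after the last attempt, so real regressions are not hidden, only transient flakes.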
[jira] [Created] (KAFKA-7981) Add Replica Fetcher and Log Cleaner Count Metrics
Viktor Somogyi-Vass created KAFKA-7981: -- Summary: Add Replica Fetcher and Log Cleaner Count Metrics Key: KAFKA-7981 URL: https://issues.apache.org/jira/browse/KAFKA-7981 Project: Kafka Issue Type: Improvement Components: metrics Affects Versions: 2.3.0 Reporter: Viktor Somogyi-Vass Assignee: Viktor Somogyi-Vass -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-8566) Force Topic Deletion
Viktor Somogyi-Vass created KAFKA-8566: -- Summary: Force Topic Deletion Key: KAFKA-8566 URL: https://issues.apache.org/jira/browse/KAFKA-8566 Project: Kafka Issue Type: Improvement Affects Versions: 2.3.0 Reporter: Viktor Somogyi-Vass Assignee: Viktor Somogyi-Vass Forcing topic deletion is sometimes useful if normal topic deletion gets stuck. In this case we want to remove the topic data from ZooKeeper anyway, as well as the segment files from the online brokers. On offline brokers the cleanup could be performed at startup. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-8330) Separate Replica and Reassignment Throttling
Viktor Somogyi-Vass created KAFKA-8330: -- Summary: Separate Replica and Reassignment Throttling Key: KAFKA-8330 URL: https://issues.apache.org/jira/browse/KAFKA-8330 Project: Kafka Issue Type: Improvement Components: core, replication Affects Versions: 2.3.0 Reporter: Viktor Somogyi-Vass Assignee: Viktor Somogyi-Vass Currently Kafka doesn't separate reassignment-related replication from ISR replication. That is dangerous because if replication is throttled during reassignment and some replicas fall out of the ISR, they can fall further behind while the reassignment is also slowed down. We need a method that throttles reassignment-related replication only. A KIP is in progress for this. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-9167) Implement a broker to controller request channel
Viktor Somogyi-Vass created KAFKA-9167: -- Summary: Implement a broker to controller request channel Key: KAFKA-9167 URL: https://issues.apache.org/jira/browse/KAFKA-9167 Project: Kafka Issue Type: Sub-task Components: controller, core Reporter: Viktor Somogyi-Vass Assignee: Viktor Somogyi-Vass In some cases, we will need to create a new API to replace an operation that was formerly done via ZooKeeper. One example of this is that when the leader of a partition wants to modify the in-sync replica set, it currently modifies ZooKeeper directly. In the post-ZK world, the leader will make an RPC to the active controller instead. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9166) Implement MetadataFetch API
Viktor Somogyi-Vass created KAFKA-9166: -- Summary: Implement MetadataFetch API Key: KAFKA-9166 URL: https://issues.apache.org/jira/browse/KAFKA-9166 Project: Kafka Issue Type: Sub-task Components: controller Reporter: Viktor Somogyi-Vass Instead of the controller pushing out updates to the other brokers, those brokers will fetch updates from the active controller via the new MetadataFetch API. A MetadataFetch is similar to a fetch request. Just like with a fetch request, the broker will track the offset of the last updates it fetched, and only request newer updates from the active controller. The broker will persist the metadata it fetched to disk. This will allow the broker to start up very quickly, even if there are hundreds of thousands or even millions of partitions. (Note that since this persistence is an optimization, we can leave it out of the first version, if it makes development easier.) Most of the time, the broker should only need to fetch the deltas, not the full state. However, if the broker is too far behind the active controller, or if the broker has no cached metadata at all, the controller will send a full metadata image rather than a series of deltas. -- This message was sent by Atlassian Jira (v8.3.4#803005)
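The delta-versus-full-image decision described above can be sketched in a few lines. All names here (`ControllerLog`, `fetchSince`, `firstRetainedOffset`) are hypothetical, and the "log" is an in-memory list standing in for the controller's metadata log; the point is only the offset-tracking contract: a broker reports the last offset it applied and receives either the newer records or, if it is behind the compaction point, the full image.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative-only model of the MetadataFetch offset tracking; not Kafka code.
class ControllerLog {
    private final List<String> log = new ArrayList<>();
    private long firstRetainedOffset = 0; // deltas below this were compacted away

    void append(String record) { log.add(record); }

    /** Marks older deltas as compacted (records kept here only for simplicity). */
    void compactUpTo(long offset) { firstRetainedOffset = offset; }

    /**
     * The broker passes the offset of the last update it applied (-1 if it has
     * no cached metadata). It gets back only the newer records, unless it is
     * behind the compaction point, in which case it gets the full image.
     */
    List<String> fetchSince(long lastFetched) {
        if (lastFetched + 1 < firstRetainedOffset) {
            return new ArrayList<>(log); // too far behind: send the full image
        }
        return new ArrayList<>(log.subList((int) (lastFetched + 1), log.size()));
    }
}
```

A freshly started broker with persisted metadata calls `fetchSince` with its checkpointed offset and typically receives a small delta, which is what makes fast startup possible even with very many partitions.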
[jira] [Created] (KAFKA-9154) ProducerId generation should be managed by the Controller
Viktor Somogyi-Vass created KAFKA-9154: -- Summary: ProducerId generation should be managed by the Controller Key: KAFKA-9154 URL: https://issues.apache.org/jira/browse/KAFKA-9154 Project: Kafka Issue Type: Sub-task Components: core Reporter: Viktor Somogyi-Vass Currently producerIds are maintained in ZooKeeper, but in the future we'd like them to be managed by the controller quorum in an internal topic. The reason for storing this in ZooKeeper was that it must be unique across the cluster. In this task it should be refactored so that the TransactionManager asks the Controller for a ProducerId, and the Controller in turn connects to ZooKeeper to acquire the ID. Since ZK remains the single source of truth and the PID won't be cached anywhere, it should be safe (just one extra hop added). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9125) GroupMetadataManager and TransactionStateManager should use metadata cache instead of zkClient
Viktor Somogyi-Vass created KAFKA-9125: -- Summary: GroupMetadataManager and TransactionStateManager should use metadata cache instead of zkClient Key: KAFKA-9125 URL: https://issues.apache.org/jira/browse/KAFKA-9125 Project: Kafka Issue Type: Sub-task Reporter: Viktor Somogyi-Vass Both classes query their respective topic's partition count via the zkClient. This could, however, be served from the broker's local metadata cache. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Reopened] (KAFKA-9124) Topic creation should be done by the controller
[ https://issues.apache.org/jira/browse/KAFKA-9124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viktor Somogyi-Vass reopened KAFKA-9124: > Topic creation should be done by the controller > --- > > Key: KAFKA-9124 > URL: https://issues.apache.org/jira/browse/KAFKA-9124 > Project: Kafka > Issue Type: Sub-task >Reporter: Viktor Somogyi-Vass >Priority: Major > > Currently {{KafkaApis}} invokes the {{adminManager}} (which essentially wraps > the zkClient) to create a topic. Instead of this we should create a protocol > that sends a request to the broker and listens for confirmation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KAFKA-9124) Topic creation should be done by the controller
[ https://issues.apache.org/jira/browse/KAFKA-9124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viktor Somogyi-Vass resolved KAFKA-9124. Resolution: Invalid Opened by mistake. > Topic creation should be done by the controller > --- > > Key: KAFKA-9124 > URL: https://issues.apache.org/jira/browse/KAFKA-9124 > Project: Kafka > Issue Type: Sub-task >Reporter: Viktor Somogyi-Vass >Priority: Major > > Currently {{KafkaApis}} invokes the {{adminManager}} (which essentially wraps > the zkClient) to create a topic. Instead of this we should create a protocol > that sends a request to the broker and listens for confirmation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9116) Improve navigation on the kafka site
Viktor Somogyi-Vass created KAFKA-9116: -- Summary: Improve navigation on the kafka site Key: KAFKA-9116 URL: https://issues.apache.org/jira/browse/KAFKA-9116 Project: Kafka Issue Type: Improvement Components: documentation Reporter: Viktor Somogyi-Vass It is often very hard to navigate the Kafka documentation, especially the configs, because it is a one-pager. It would be useful to somehow shrink its size and make it more readable, for instance by using tabs for the broker, producer, consumer, topic etc. configs (like in https://getbootstrap.com/docs/3.3/components/#nav), but any other idea that makes it better would be fine. It is important that it should stay searchable, as the number of configs is huge. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9118) LogDirFailureHandler shouldn't use Zookeeper
Viktor Somogyi-Vass created KAFKA-9118: -- Summary: LogDirFailureHandler shouldn't use Zookeeper Key: KAFKA-9118 URL: https://issues.apache.org/jira/browse/KAFKA-9118 Project: Kafka Issue Type: Improvement Reporter: Viktor Somogyi-Vass Assignee: Viktor Somogyi-Vass -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9124) Topic creation should be done by the controller
Viktor Somogyi-Vass created KAFKA-9124: -- Summary: Topic creation should be done by the controller Key: KAFKA-9124 URL: https://issues.apache.org/jira/browse/KAFKA-9124 Project: Kafka Issue Type: Sub-task Reporter: Viktor Somogyi-Vass Currently {{KafkaApis}} invokes the {{adminManager}} (which essentially wraps the zkClient) to create a topic. Instead of this we should create a protocol that sends a request to the broker and listens for confirmation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KAFKA-8081) Flaky Test TopicCommandWithAdminClientTest#testDescribeUnderMinIsrPartitions
[ https://issues.apache.org/jira/browse/KAFKA-8081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viktor Somogyi-Vass resolved KAFKA-8081. Assignee: Viktor Somogyi-Vass Resolution: Fixed > Flaky Test TopicCommandWithAdminClientTest#testDescribeUnderMinIsrPartitions > > > Key: KAFKA-8081 > URL: https://issues.apache.org/jira/browse/KAFKA-8081 > Project: Kafka > Issue Type: Bug > Components: admin, unit tests >Affects Versions: 2.3.0 >Reporter: Matthias J. Sax >Assignee: Viktor Somogyi-Vass >Priority: Critical > Labels: flaky-test > Fix For: 2.5.0 > > > [https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3450/tests] > {quote}java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:87) > at org.junit.Assert.assertTrue(Assert.java:42) > at org.junit.Assert.assertTrue(Assert.java:53) > at > kafka.admin.TopicCommandWithAdminClientTest.testDescribeUnderMinIsrPartitions(TopicCommandWithAdminClientTest.scala:560){quote} > STDOUT > {quote}[2019-03-09 00:13:23,129] ERROR [ReplicaFetcher replicaId=2, > leaderId=0, fetcherId=0] Error for partition > testAlterPartitionCount-yrX0KHVdgf-1 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-03-09 00:13:23,130] ERROR [ReplicaFetcher replicaId=3, leaderId=5, > fetcherId=0] Error for partition testAlterPartitionCount-yrX0KHVdgf-0 at > offset 0 (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-03-09 00:13:33,184] ERROR [ReplicaFetcher replicaId=4, leaderId=1, > fetcherId=0] Error for partition testAlterPartitionCount-yrX0KHVdgf-2 at > offset 0 (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. 
> [2019-03-09 00:13:36,319] WARN Unable to read additional data from client > sessionid 0x10433c934dc0006, likely client has closed socket > (org.apache.zookeeper.server.NIOServerCnxn:376) > [2019-03-09 00:13:46,311] WARN Unable to read additional data from client > sessionid 0x10433c98cb10003, likely client has closed socket > (org.apache.zookeeper.server.NIOServerCnxn:376) > [2019-03-09 00:13:46,356] WARN Unable to read additional data from client > sessionid 0x10433c98cb10004, likely client has closed socket > (org.apache.zookeeper.server.NIOServerCnxn:376) > [2019-03-09 00:14:06,066] WARN Unable to read additional data from client > sessionid 0x10433c9b17d0005, likely client has closed socket > (org.apache.zookeeper.server.NIOServerCnxn:376) > [2019-03-09 00:14:06,457] WARN Unable to read additional data from client > sessionid 0x10433c9b17d0001, likely client has closed socket > (org.apache.zookeeper.server.NIOServerCnxn:376) > [2019-03-09 00:14:11,206] ERROR [ReplicaFetcher replicaId=2, leaderId=3, > fetcherId=0] Error for partition kafka.testTopic1-1 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-03-09 00:14:11,218] ERROR [ReplicaFetcher replicaId=4, leaderId=1, > fetcherId=0] Error for partition __consumer_offsets-1 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. 
> [2019-03-09 00:14:22,096] WARN Unable to read additional data from client > sessionid 0x10433c9f1210004, likely client has closed socket > (org.apache.zookeeper.server.NIOServerCnxn:376) > [2019-03-09 00:14:28,290] WARN Unable to read additional data from client > sessionid 0x10433ca28de0005, likely client has closed socket > (org.apache.zookeeper.server.NIOServerCnxn:376) > [2019-03-09 00:14:28,733] WARN Unable to read additional data from client > sessionid 0x10433ca28de0006, likely client has closed socket > (org.apache.zookeeper.server.NIOServerCnxn:376) > [2019-03-09 00:14:29,529] WARN Unable to read additional data from client > sessionid 0x10433ca28de, likely client has closed socket > (org.apache.zookeeper.server.NIOServerCnxn:376) > [2019-03-09 00:14:31,841] WARN Unable to read additional data from client > sessionid 0x10433ca39ed0002, likely client has closed socket > (org.apache.zookeeper.server.NIOServerCnxn:376) > [2019-03-09 00:14:40,221] ERROR [ReplicaFetcher replicaId=0, leaderId=1, > fetcherId=0] Error for partition > testCreateAlterTopicWithRackAware-IQe98agDrW-16 at offset 0 > (kafka.server.ReplicaFetcherThread:76) >
[jira] [Created] (KAFKA-9049) TopicCommandWithAdminClientTest should use mocks
Viktor Somogyi-Vass created KAFKA-9049: -- Summary: TopicCommandWithAdminClientTest should use mocks Key: KAFKA-9049 URL: https://issues.apache.org/jira/browse/KAFKA-9049 Project: Kafka Issue Type: Improvement Components: unit tests Affects Versions: 2.5.0 Reporter: Viktor Somogyi-Vass The {{TopicCommandWithAdminClientTest}} class currently sets up a few brokers for every test case which is wasteful and slow. We should improve it by mocking out the broker behavior (maybe use {{MockAdminClient}}?). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9059) Implement MaxReassignmentLag
Viktor Somogyi-Vass created KAFKA-9059: -- Summary: Implement MaxReassignmentLag Key: KAFKA-9059 URL: https://issues.apache.org/jira/browse/KAFKA-9059 Project: Kafka Issue Type: Improvement Affects Versions: 2.5.0 Reporter: Viktor Somogyi-Vass Assignee: Viktor Somogyi-Vass Some more thinking is required for the implementation of MaxReassignmentLag proposed by [KIP-352|https://cwiki.apache.org/confluence/display/KAFKA/KIP-352:+Distinguish+URPs+caused+by+reassignment], as it's not so straightforward (we need to maintain each partition's lag); therefore we separated it off into this JIRA. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9015) Unify metric names
Viktor Somogyi-Vass created KAFKA-9015: -- Summary: Unify metric names Key: KAFKA-9015 URL: https://issues.apache.org/jira/browse/KAFKA-9015 Project: Kafka Issue Type: Improvement Components: metrics Affects Versions: 2.5.0 Reporter: Viktor Somogyi-Vass Assignee: Viktor Somogyi-Vass Some of the metrics use a lower-case, hyphenated name style like "i-am-a-metric" while the majority use UpperCamelCase (IAmAnotherMetric). We need consistency across the project, and since the majority of the metrics use the camel case notation, we need to change the others. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9016) Warn when log dir stopped serving replicas
Viktor Somogyi-Vass created KAFKA-9016: -- Summary: Warn when log dir stopped serving replicas Key: KAFKA-9016 URL: https://issues.apache.org/jira/browse/KAFKA-9016 Project: Kafka Issue Type: Improvement Components: log, log cleaner Reporter: Viktor Somogyi-Vass Kafka should warn if a log directory stops serving replicas, as this is usually due to an error and it is much more visible at warn level. {noformat} 2019-09-19 12:39:54,194 ERROR kafka.server.LogDirFailureChannel: Error while writing to checkpoint file /kafka/data/diskX/replication-offset-checkpoint java.io.SyncFailedException: sync failed .. 2019-09-19 12:39:54,197 INFO kafka.server.ReplicaManager: [ReplicaManager broker=638] Stopping serving replicas in dir /kafka/data/diskX .. 2019-09-19 12:39:54,205 INFO kafka.server.ReplicaFetcherManager: [ReplicaFetcherManager on broker 638] Removed fetcher for partitions Set(test1-0, test2-2, test-0, test2-2, test4-14, test4-0, test2-6) 2019-09-19 12:39:54,206 INFO kafka.server.ReplicaAlterLogDirsManager: [ReplicaAlterLogDirsManager on broker 638] Removed fetcher for partitions Set(test1-0, test2-2, test-0, test3-2, test4-14, test4-0, test2-6) 2019-09-19 12:39:54,222 INFO kafka.server.ReplicaManager: [ReplicaManager broker=638] Broker 638 stopped fetcher for partitions test1-0,test2-2,test-0,test3-2,test4-14,test4-0,test2-6 and stopped moving logs for partitions because they are in the failed log directory /kafka/data/diskX. {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9814) Allow AdminClient.listTopics to list based on expression
Viktor Somogyi-Vass created KAFKA-9814: -- Summary: Allow AdminClient.listTopics to list based on expression Key: KAFKA-9814 URL: https://issues.apache.org/jira/browse/KAFKA-9814 Project: Kafka Issue Type: Improvement Affects Versions: 2.5.0 Reporter: Viktor Somogyi-Vass Assignee: Viktor Somogyi-Vass Currently listTopics lists all the topics in the cluster, which often isn't desired: clusters can hold a large number of topics, the authorizer has to check access for every one of them (bloating access/audit logs and making them hard to read), and returning all of them inflates the response size. To take an example, let's have a user who issues a naive command such as: {noformat} kafka-topics --list --topic topic6 --bootstrap-server mycompany.com:9092 --command-config client.properties {noformat} In this case the admin client issues a metadata request for all topics. When this request gets to the broker, the broker checks permissions for every topic in the cluster and returns those allowed for "describe" to the client. The command then simply filters {{topic6}} out of the response. Looking at this problem we can see that if we allow the broker to do the filtering, we can: * do fewer permission checks * return less data to the clients Regarding the solution I see two paths, but both of them are of course subject to community discussion: # modify the MetadataRequest to carry some kind of regex # create a new list API that would contain this filter pattern Perhaps the list API makes more sense: the MetadataRequest is currently used because it just fits, but there is no real reason to extend it with this, as the functionality won't be used elsewhere. -- This message was sent by Atlassian Jira (v8.3.4#803005)
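The broker-side filtering idea can be sketched as follows. This is not the actual MetadataRequest or admin API code; `TopicLister` and its parameters are hypothetical. The point is the ordering: the cheap name filter runs before the ACL check, so authorization is evaluated only for matching topics and the response carries only what the client asked for.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Illustrative sketch of server-side topic listing with a client-supplied regex.
class TopicLister {
    static List<String> list(List<String> allTopics, String regex,
                             Predicate<String> authorizedToDescribe) {
        Pattern p = Pattern.compile(regex);
        return allTopics.stream()
                .filter(t -> p.matcher(t).matches())  // cheap name filter first
                .filter(authorizedToDescribe)         // ACL check only on matches
                .collect(Collectors.toList());
    }
}
```

With the current behavior the `authorizedToDescribe` predicate would run for every topic in the cluster; here it runs only for the topics the pattern selects.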
[jira] [Resolved] (KAFKA-7805) Use --bootstrap-server option for kafka-topics.sh in ducktape tests where applicable
[ https://issues.apache.org/jira/browse/KAFKA-7805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viktor Somogyi-Vass resolved KAFKA-7805. Resolution: Fixed > Use --bootstrap-server option for kafka-topics.sh in ducktape tests where > applicable > > > Key: KAFKA-7805 > URL: https://issues.apache.org/jira/browse/KAFKA-7805 > Project: Kafka > Issue Type: Improvement > Components: system tests >Reporter: Viktor Somogyi-Vass >Priority: Major > > KIP-377 introduces the {{--bootstrap-server}} option and deprecates the > {{--zookeeper}} option in {{kafka-topics.sh}}. It'd be nice to use the new > option in the ducktape tests to gain higher test coverage. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-10650) Use Murmur3 hashing instead of MD5 in SkimpyOffsetMap
Viktor Somogyi-Vass created KAFKA-10650: --- Summary: Use Murmur3 hashing instead of MD5 in SkimpyOffsetMap Key: KAFKA-10650 URL: https://issues.apache.org/jira/browse/KAFKA-10650 Project: Kafka Issue Type: Improvement Components: core Reporter: Viktor Somogyi-Vass Assignee: Viktor Somogyi-Vass The usage of MD5 was uncovered while testing Kafka for FIPS (Federal Information Processing Standards) verification. While MD5 isn't a FIPS incompatibility here, as it isn't used for cryptographic purposes, I spent some time with it because it isn't ideal either. MD5 is a relatively fast cryptographic hashing algorithm, but there are much better performing algorithms for hash tables, which is how it's used in SkimpyOffsetMap. By applying Murmur3 (as implemented in Streams) I could achieve a 3x faster {{put}} operation, and the overall segment cleaning sped up by 30% while preserving the same collision rate (both performed within 0.0015 - 0.007, mostly with a 0.004 median). -- This message was sent by Atlassian Jira (v8.3.4#803005)
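The general idea, replacing a cryptographic digest with a cheap non-cryptographic hash for hash-table keys, can be shown with a tiny example. FNV-1a stands in for Murmur3 here purely to keep the sketch dependency-free (Murmur3 itself is more involved; the ticket reuses the implementation from Kafka Streams).

```java
// Illustration of a cheap non-cryptographic hash for hash-table use cases.
// FNV-1a is a stand-in for Murmur3, not the algorithm the ticket adopts.
class Fnv {
    /** 32-bit FNV-1a: xor each byte into the state, then multiply by the prime. */
    static int fnv1a32(byte[] data) {
        int hash = 0x811c9dc5;      // FNV offset basis
        for (byte b : data) {
            hash ^= (b & 0xff);
            hash *= 0x01000193;     // FNV prime
        }
        return hash;
    }
}
```

Unlike `MessageDigest.getInstance("MD5")`, a hash like this needs no object allocation or JCA lookup per call, which is where the per-`put` speedup in such a swap comes from.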
[jira] [Created] (KAFKA-12922) MirrorCheckpointTask should close topic filter
Viktor Somogyi-Vass created KAFKA-12922: --- Summary: MirrorCheckpointTask should close topic filter Key: KAFKA-12922 URL: https://issues.apache.org/jira/browse/KAFKA-12922 Project: Kafka Issue Type: Improvement Components: mirrormaker Affects Versions: 2.8.0 Reporter: Viktor Somogyi-Vass Assignee: Viktor Somogyi-Vass When a lot of connectors were restarted, it turned out that the underlying config consumers are not closed properly; from the logs we can see that the old ones are still running. It turns out that MirrorCheckpointTask uses a TopicFilter but never closes it, leaking resources. -- This message was sent by Atlassian Jira (v8.3.4#803005)
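The shape of the fix can be sketched as follows. `TopicFilter` and `CheckpointTask` here are simplified stand-ins for the MM2 classes, not their real implementations; in MM2 the filter's `close()` is what releases the underlying config consumer.

```java
// Sketch of the fix: treat the filter as a resource and close it on task stop.
class TopicFilter implements AutoCloseable {
    boolean closed = false;

    boolean shouldReplicate(String topic) { return !topic.startsWith("__"); }

    @Override
    public void close() { closed = true; } // release underlying resources here
}

class CheckpointTask {
    final TopicFilter filter = new TopicFilter();

    void stop() {
        filter.close(); // the previously missing step: without it, the filter leaks
    }
}
```

Making the filter `AutoCloseable` also lets short-lived usages wrap it in try-with-resources so the close cannot be forgotten.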
[jira] [Created] (KAFKA-13240) HTTP TRACE should be disabled in Connect
Viktor Somogyi-Vass created KAFKA-13240: --- Summary: HTTP TRACE should be disabled in Connect Key: KAFKA-13240 URL: https://issues.apache.org/jira/browse/KAFKA-13240 Project: Kafka Issue Type: Improvement Components: KafkaConnect Reporter: Viktor Somogyi-Vass Assignee: Viktor Somogyi-Vass Modern browsers mostly disable HTTP TRACE to prevent XST (cross-site tracing) attacks, so this type of attack isn't too prevalent these days; but since TRACE isn't disabled in Connect, it may still open up possible attack vectors (and it constantly pops up in security scans :) ). Therefore we'd like to disable it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-13442) REST API endpoint for fetching a connector's config definition
Viktor Somogyi-Vass created KAFKA-13442: --- Summary: REST API endpoint for fetching a connector's config definition Key: KAFKA-13442 URL: https://issues.apache.org/jira/browse/KAFKA-13442 Project: Kafka Issue Type: Improvement Components: KafkaConnect Affects Versions: 3.2.0 Reporter: Viktor Somogyi-Vass Assignee: Viktor Somogyi-Vass To enhance UI-based applications' capability in helping users create new connectors from default configurations, it would be very useful to have an API that can fetch a connector type's configuration definition, which users can fill out and send back for validation and then use to create a new connector. The API should be placed under {{connector-plugins}} and since {{connector-plugins/\{connectorType\}/config/validate}} already exists, {{connector-plugins/\{connectorType\}/config}} might be a good option for the new API. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (KAFKA-13917) Avoid calling lookupCoordinator() in tight loop
Viktor Somogyi-Vass created KAFKA-13917: --- Summary: Avoid calling lookupCoordinator() in tight loop Key: KAFKA-13917 URL: https://issues.apache.org/jira/browse/KAFKA-13917 Project: Kafka Issue Type: Improvement Components: consumer Affects Versions: 3.1.1, 3.1.0, 3.1.2 Reporter: Viktor Somogyi-Vass Assignee: Viktor Somogyi-Vass Currently the heartbeat thread's lookupCoordinator() is called in a tight loop if brokers crash and the consumer is left running. Besides flooding the logs at debug level, it increases CPU usage as well. The fix is easy: we just need to add a backoff call after the coordinator lookup. -- This message was sent by Atlassian Jira (v8.20.7#820007)
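A backoff between lookups is the whole fix; a minimal sketch of such a schedule is below. The exponential shape and the `LookupBackoff` class are illustrative assumptions; in the consumer the actual wait is governed by its configured `retry.backoff.ms`.

```java
// Illustrative backoff schedule for retried coordinator lookups: instead of
// looping immediately, the caller sleeps nextDelayMs() before each retry.
class LookupBackoff {
    private final long initialMs;
    private final long maxMs;
    private int attempts = 0;

    LookupBackoff(long initialMs, long maxMs) {
        this.initialMs = initialMs;
        this.maxMs = maxMs;
    }

    /** Delay to sleep before the next lookup attempt; doubles up to a cap. */
    long nextDelayMs() {
        long delay = initialMs << Math.min(attempts, 20); // shift capped to avoid overflow
        attempts++;
        return Math.min(delay, maxMs);
    }
}
```

Even a constant (non-exponential) delay would remove the tight loop; the cap just keeps worst-case recovery latency bounded once brokers come back.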
[jira] [Created] (KAFKA-13949) Connect /connectors endpoint should support querying the active topics and the task configs
Viktor Somogyi-Vass created KAFKA-13949: --- Summary: Connect /connectors endpoint should support querying the active topics and the task configs Key: KAFKA-13949 URL: https://issues.apache.org/jira/browse/KAFKA-13949 Project: Kafka Issue Type: Improvement Components: KafkaConnect Affects Versions: 3.2.0 Reporter: Viktor Somogyi-Vass Assignee: Viktor Somogyi-Vass The /connectors endpoint supports the "expand" query parameter, which acts as a set of queried categories, currently supporting info (config) and status (monitoring status). The endpoint should also support adding the active topics of a connector, and adding the separate task configs, too. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Reopened] (KAFKA-6945) Add support to allow users to acquire delegation tokens for other users
[ https://issues.apache.org/jira/browse/KAFKA-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viktor Somogyi-Vass reopened KAFKA-6945: > Add support to allow users to acquire delegation tokens for other users > --- > > Key: KAFKA-6945 > URL: https://issues.apache.org/jira/browse/KAFKA-6945 > Project: Kafka > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: Manikumar >Assignee: Viktor Somogyi-Vass >Priority: Major > Labels: needs-kip > Fix For: 3.3.0 > > > Currently, we only allow a user to create delegation token for that user > only. > We should allow users to acquire delegation tokens for other users. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Resolved] (KAFKA-13452) MM2 creates invalid checkpoint when offset mapping is not available
[ https://issues.apache.org/jira/browse/KAFKA-13452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viktor Somogyi-Vass resolved KAFKA-13452. - Resolution: Duplicate > MM2 creates invalid checkpoint when offset mapping is not available > --- > > Key: KAFKA-13452 > URL: https://issues.apache.org/jira/browse/KAFKA-13452 > Project: Kafka > Issue Type: Improvement > Components: mirrormaker >Reporter: Daniel Urban >Assignee: Viktor Somogyi-Vass >Priority: Major > > MM2 checkpointing reads the offset-syncs topic to create offset mappings for > committed consumer group offsets. In some corner cases, it is possible that a > mapping is not available in offset-syncs - in that case, MM2 simply copies > the source offset, which might not be a valid offset in the replica topic at > all. > One possible situation is if there is an empty topic in the source cluster > with a non-zero endoffset (e.g. retention already removed the records), and a > consumer group which has a committed offset set to the end offset. If > replication is configured to start replicating this topic, it will not have > an offset mapping available in offset-syncs (as the topic is empty), causing > MM2 to copy the source offset. > This can cause issues when auto offset sync is enabled, as the consumer group > offset can be potentially set to a high number. MM2 never rewinds these > offsets, so even when there is a correct offset mapping available, the offset > will not be updated correctly. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Resolved] (KAFKA-6084) ReassignPartitionsCommand should propagate JSON parsing failures
[ https://issues.apache.org/jira/browse/KAFKA-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viktor Somogyi-Vass resolved KAFKA-6084. Fix Version/s: 2.8.0 Resolution: Fixed > ReassignPartitionsCommand should propagate JSON parsing failures > > > Key: KAFKA-6084 > URL: https://issues.apache.org/jira/browse/KAFKA-6084 > Project: Kafka > Issue Type: Improvement > Components: admin >Affects Versions: 0.11.0.0 >Reporter: Viktor Somogyi-Vass >Assignee: Viktor Somogyi-Vass >Priority: Minor > Labels: easyfix, newbie > Fix For: 2.8.0 > > Attachments: Screen Shot 2017-10-18 at 23.31.22.png > > > Basically, looking at Json.scala, it will always swallow any parsing errors: > {code} > def parseFull(input: String): Option[JsonValue] = > try Option(mapper.readTree(input)).map(JsonValue(_)) > catch { case _: JsonProcessingException => None } > {code} > However, while sometimes it is easy to figure out the problem by simply looking at > the JSON, in other cases it is not trivial: some invisible > characters (like a byte order mark) won't be displayed by most text > editors, and people can spend time figuring out what the problem is. > As Jackson provides a really detailed exception about what failed and how, it > is easy to propagate the failure to the user. 
> As an example I attached a BOM prefixed JSON which fails with the following > error which is very counterintuitive: > {noformat} > [root@localhost ~]# kafka-reassign-partitions --zookeeper localhost:2181 > --reassignment-json-file /root/increase-replication-factor.json --execute > Partitions reassignment failed due to Partition reassignment data file > /root/increase-replication-factor.json is empty > kafka.common.AdminCommandFailedException: Partition reassignment data file > /root/increase-replication-factor.json is empty > at > kafka.admin.ReassignPartitionsCommand$.executeAssignment(ReassignPartitionsCommand.scala:120) > at > kafka.admin.ReassignPartitionsCommand$.main(ReassignPartitionsCommand.scala:52) > at kafka.admin.ReassignPartitionsCommand.main(ReassignPartitionsCommand.scala) > ... > {noformat} > In case of the above error it would be much better to see what fails exactly: > {noformat} > kafka.common.AdminCommandFailedException: Admin command failed > at > kafka.admin.ReassignPartitionsCommand$.parsePartitionReassignmentData(ReassignPartitionsCommand.scala:267) > at > kafka.admin.ReassignPartitionsCommand$.parseAndValidate(ReassignPartitionsCommand.scala:275) > at > kafka.admin.ReassignPartitionsCommand$.executeAssignment(ReassignPartitionsCommand.scala:197) > at > kafka.admin.ReassignPartitionsCommand$.executeAssignment(ReassignPartitionsCommand.scala:193) > at > kafka.admin.ReassignPartitionsCommand$.main(ReassignPartitionsCommand.scala:64) > at > kafka.admin.ReassignPartitionsCommand.main(ReassignPartitionsCommand.scala) > Caused by: com.fasterxml.jackson.core.JsonParseException: Unexpected > character ('' (code 65279 / 0xfeff)): expected a valid value (number, > String, array, object, 'true', 'false' or 'null') > at [Source: (String)"{"version":1, > "partitions":[ >{"topic": "test1", "partition": 0, "replicas": [1,2]}, >{"topic": "test2", "partition": 1, "replicas": [2,3]} > ]}"; line: 1, column: 2] > at > 
com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1798) > at > com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:663) > at > com.fasterxml.jackson.core.base.ParserMinimalBase._reportUnexpectedChar(ParserMinimalBase.java:561) > at > com.fasterxml.jackson.core.json.ReaderBasedJsonParser._handleOddValue(ReaderBasedJsonParser.java:1892) > at > com.fasterxml.jackson.core.json.ReaderBasedJsonParser.nextToken(ReaderBasedJsonParser.java:747) > at > com.fasterxml.jackson.databind.ObjectMapper._readTreeAndClose(ObjectMapper.java:4030) > at > com.fasterxml.jackson.databind.ObjectMapper.readTree(ObjectMapper.java:2539) > at kafka.utils.Json$.kafka$utils$Json$$doParseFull(Json.scala:46) > at kafka.utils.Json$$anonfun$tryParseFull$1.apply(Json.scala:44) > at kafka.utils.Json$$anonfun$tryParseFull$1.apply(Json.scala:44) > at scala.util.Try$.apply(Try.scala:192) > at kafka.utils.Json$.tryParseFull(Json.scala:44) > at > kafka.admin.ReassignPartitionsCommand$.parsePartitionReassignmentData(ReassignPartitionsCommand.scala:241) > ... 5 more > {noformat} -- This message was sent by Atlassian Jira (v8.20.7#820007)
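The direction of the fix, keeping the parser's detailed error instead of collapsing it into an empty Option, can be sketched without any Jackson dependency. `Integer.parseInt` stands in for the JSON parser here, and the `Parse` class is hypothetical; the real change surfaces the `JsonProcessingException` message shown in the stack trace above.

```java
import java.util.Optional;

// Before/after sketch of error propagation when parsing user-supplied input.
class Parse {
    // Before: mirrors the Json.parseFull pattern above; the cause is swallowed.
    static Optional<Integer> parseSwallowing(String s) {
        try {
            return Optional.of(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            return Optional.empty(); // caller only learns "it failed", not why
        }
    }

    // After: the underlying message (naming the offending input) reaches the user.
    static int parsePropagating(String s) {
        try {
            return Integer.parseInt(s);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                "Failed to parse reassignment data: " + e.getMessage(), e);
        }
    }
}
```

With the "before" variant a BOM-prefixed input and a genuinely empty file are indistinguishable to the caller, which is exactly how the misleading "data file is empty" message arises.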
[jira] [Resolved] (KAFKA-13442) REST API endpoint for fetching a connector's config definition
[ https://issues.apache.org/jira/browse/KAFKA-13442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viktor Somogyi-Vass resolved KAFKA-13442. - Resolution: Duplicate > REST API endpoint for fetching a connector's config definition > -- > > Key: KAFKA-13442 > URL: https://issues.apache.org/jira/browse/KAFKA-13442 > Project: Kafka > Issue Type: Improvement > Components: KafkaConnect >Affects Versions: 3.2.0 >Reporter: Viktor Somogyi-Vass >Assignee: Viktor Somogyi-Vass >Priority: Major > > To enhance UI-based applications' ability to help users create new > connectors from default configurations, it would be very useful to have an API > that fetches a connector type's configuration definition, which users can > fill out, send back for validation, and then use to create a new connector. > The API should be placed under {{connector-plugins}} and since > {{connector-plugins/\{connectorType\}/config/validate}} already exists, > {{connector-plugins/\{connectorType\}/config}} might be a good option for the > new API. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Resolved] (KAFKA-14331) Upgrade to Scala 2.13.10
[ https://issues.apache.org/jira/browse/KAFKA-14331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viktor Somogyi-Vass resolved KAFKA-14331. - Resolution: Duplicate > Upgrade to Scala 2.13.10 > > > Key: KAFKA-14331 > URL: https://issues.apache.org/jira/browse/KAFKA-14331 > Project: Kafka > Issue Type: Improvement > Components: core >Affects Versions: 3.4.0 >Reporter: Viktor Somogyi-Vass >Priority: Major > > There are some CVEs in Scala 2.13.8, so we should upgrade to the latest. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14331) Upgrade to Scala 2.13.10
Viktor Somogyi-Vass created KAFKA-14331: --- Summary: Upgrade to Scala 2.13.10 Key: KAFKA-14331 URL: https://issues.apache.org/jira/browse/KAFKA-14331 Project: Kafka Issue Type: Improvement Components: core Affects Versions: 3.4.0 Reporter: Viktor Somogyi-Vass There are some CVEs in Scala 2.13.8, so we should upgrade to the latest. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14250) Exception during normal operation in MirrorSourceTask causes the task to fail instead of shutting down gracefully
Viktor Somogyi-Vass created KAFKA-14250: --- Summary: Exception during normal operation in MirrorSourceTask causes the task to fail instead of shutting down gracefully Key: KAFKA-14250 URL: https://issues.apache.org/jira/browse/KAFKA-14250 Project: Kafka Issue Type: Bug Components: KafkaConnect, mirrormaker Affects Versions: 3.3 Reporter: Viktor Somogyi-Vass Assignee: Viktor Somogyi-Vass In MirrorSourceTask we are loading offsets for the topic partitions. At this point, while we are fetching the partitions, it is possible for the offset reader to be stopped by a parallel thread. Stopping the reader causes a CancellationException to be thrown, due to KAFKA-9051. Currently this exception is not caught in MirrorSourceTask, so it propagates up and puts the task into the FAILED state. We only need it to go to the STOPPED state so that it can be restarted later. This can be achieved by catching the exception and stopping the task directly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
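The proposed handling can be sketched in a few lines. This is an illustrative Python analogue (Kafka's actual code is Java, and `load_offsets` is a hypothetical name), showing a concurrent cancellation being translated into a graceful stop rather than a failure:

```python
from concurrent.futures import Future, CancelledError

def load_offsets(offset_future):
    # Hypothetical sketch: if a parallel thread stops the offset reader,
    # the pending read is cancelled. Catch the cancellation and report a
    # graceful stop instead of letting the exception fail the task.
    try:
        return offset_future.result(timeout=1), "RUNNING"
    except CancelledError:
        return None, "STOPPED"

reader = Future()
reader.cancel()  # simulates the offset reader being stopped concurrently
offsets, state = load_offsets(reader)
```

The point is the placement of the catch: it sits exactly around the blocking offset read, so genuine errors elsewhere still fail the task as before.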
[jira] [Created] (KAFKA-14281) Multi-level rack awareness
Viktor Somogyi-Vass created KAFKA-14281: --- Summary: Multi-level rack awareness Key: KAFKA-14281 URL: https://issues.apache.org/jira/browse/KAFKA-14281 Project: Kafka Issue Type: Improvement Components: core Affects Versions: 3.4.0 Reporter: Viktor Somogyi-Vass Assignee: Viktor Somogyi-Vass h1. Motivation With replication services, data can be replicated across independent Kafka clusters in multiple data centers. In addition, many customers need "stretch clusters" - a single Kafka cluster that spans multiple data centers. This architecture has the following useful characteristics: - Data is natively replicated into all data centers by Kafka topic replication. - No data is lost when one DC is lost, and no configuration change is required - the design implicitly relies on native Kafka replication. - From an operational point of view, it is much easier to configure and operate such a topology than a replication scenario via MM2. Kafka should provide "native" support for stretch clusters, covering any special operational aspects of stretch clusters. h2. Multi-level rack awareness Currently, stretch clusters are implemented using the rack awareness feature, where each DC is represented as a rack. This ensures that replicas are spread across DCs evenly. Unfortunately, there are cases where this is too limiting - if there are actual racks inside the DCs, we cannot specify those. Consider having 3 DCs with 2 racks each: /DC1/R1, /DC1/R2 /DC2/R1, /DC2/R2 /DC3/R1, /DC3/R2 If we were to use racks as DC1, DC2, DC3, we lose the rack-level information of the setup. This means that with RF=6, the 2 replicas assigned to DC1 may both end up in the same rack. If we were to use racks as /DC1/R1, /DC1/R2, etc., then with RF=3 it is possible that 2 replicas end up in the same DC, e.g. /DC1/R1, /DC1/R2, /DC2/R1. 
Because of this, Kafka should support "multi-level" racks, meaning that rack IDs should be able to describe a hierarchy. With this feature, brokers should be able to: # spread replicas evenly based on the top level of the hierarchy (i.e. first, between DCs) # then inside a top-level unit (DC), if there are multiple replicas, spread them evenly among lower-level units (i.e. between racks, then between physical hosts, and so on) ## repeat for all levels -- This message was sent by Atlassian Jira (v8.20.10#820010)
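The multi-level spreading rule above can be sketched as follows. This is a hedged illustration of the placement idea only (round-robin over DCs first, then rotating racks within each DC), not Kafka's actual replica assignor:

```python
from itertools import cycle

def place_replicas(hierarchy, rf):
    # hierarchy: top-level unit (DC) -> list of lower-level units (racks).
    # Spread replicas evenly across DCs first; within each DC, rotate
    # through its racks so no rack receives a second replica before the
    # other racks in that DC have one.
    dcs = cycle(sorted(hierarchy))
    used = {dc: 0 for dc in hierarchy}
    placement = []
    for _ in range(rf):
        dc = next(dcs)
        racks = hierarchy[dc]
        placement.append(f"/{dc}/{racks[used[dc] % len(racks)]}")
        used[dc] += 1
    return placement

dcs = {"DC1": ["R1", "R2"], "DC2": ["R1", "R2"], "DC3": ["R1", "R2"]}
rf6 = place_replicas(dcs, 6)  # every DC gets 2 replicas, in distinct racks
rf3 = place_replicas(dcs, 3)  # exactly one replica per DC
```

With RF=6 this avoids the problem described above: each DC receives two replicas and they land in different racks; with RF=3 no DC receives more than one replica. A real assignor would also have to account for broker counts per rack and leader balancing.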
[jira] [Resolved] (KAFKA-14929) Flaky KafkaStatusBackingStoreFormatTest#putTopicStateRetriableFailure
[ https://issues.apache.org/jira/browse/KAFKA-14929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viktor Somogyi-Vass resolved KAFKA-14929. - Resolution: Fixed > Flaky KafkaStatusBackingStoreFormatTest#putTopicStateRetriableFailure > - > > Key: KAFKA-14929 > URL: https://issues.apache.org/jira/browse/KAFKA-14929 > Project: Kafka > Issue Type: Test > Components: KafkaConnect >Reporter: Greg Harris >Assignee: Sagar Rao >Priority: Major > Labels: flaky-test > Fix For: 3.5.0 > > > This test recently started flaky-failing with the following stack trace: > {noformat} > org.mockito.exceptions.verification.TooFewActualInvocations: > kafkaBasedLog.send(, , ); > Wanted 2 times:-> > at org.apache.kafka.connect.util.KafkaBasedLog.send(KafkaBasedLog.java:376) > But was 1 time:-> > at > org.apache.kafka.connect.storage.KafkaStatusBackingStore.sendTopicStatus(KafkaStatusBackingStore.java:315) > at > app//org.apache.kafka.connect.util.KafkaBasedLog.send(KafkaBasedLog.java:376) > at > app//org.apache.kafka.connect.storage.KafkaStatusBackingStoreFormatTest.putTopicStateRetriableFailure(KafkaStatusBackingStoreFormatTest.java:219) > at > java.base@11.0.16.1/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > at > java.base@11.0.16.1/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base@11.0.16.1/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base@11.0.16.1/java.lang.reflect.Method.invoke(Method.java:566) > ...{noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-15161) InvalidReplicationFactorException at connect startup
Viktor Somogyi-Vass created KAFKA-15161: --- Summary: InvalidReplicationFactorException at connect startup Key: KAFKA-15161 URL: https://issues.apache.org/jira/browse/KAFKA-15161 Project: Kafka Issue Type: Improvement Components: clients, KafkaConnect Affects Versions: 3.6.0 Reporter: Viktor Somogyi-Vass h2. Problem description In our system test environment, Connect may in certain cases fail to start up due to a very specific timing issue. The problem lies in the exact timing of a Kafka cluster and Connect start/restart. In these cases, when the broker doesn't have metadata yet and a consumer in Connect starts and asks for topic metadata, the request fails with the following exception: {noformat} [2023-07-07 13:56:47,994] ERROR [Worker clientId=connect-1, groupId=connect-cluster] Uncaught exception in herder work thread, exiting: (org.apache.kafka.connect.runtime.distributed.DistributedHerder) org.apache.kafka.common.KafkaException: Unexpected error fetching metadata for topic connect-offsets at org.apache.kafka.clients.consumer.internals.TopicMetadataFetcher.getTopicMetadata(TopicMetadataFetcher.java:130) at org.apache.kafka.clients.consumer.internals.TopicMetadataFetcher.getTopicMetadata(TopicMetadataFetcher.java:66) at org.apache.kafka.clients.consumer.KafkaConsumer.partitionsFor(KafkaConsumer.java:2001) at org.apache.kafka.clients.consumer.KafkaConsumer.partitionsFor(KafkaConsumer.java:1969) at org.apache.kafka.connect.util.KafkaBasedLog.start(KafkaBasedLog.java:251) at org.apache.kafka.connect.storage.KafkaOffsetBackingStore.start(KafkaOffsetBackingStore.java:242) at org.apache.kafka.connect.runtime.Worker.start(Worker.java:230) at org.apache.kafka.connect.runtime.AbstractHerder.startServices(AbstractHerder.java:151) at org.apache.kafka.connect.runtime.distributed.DistributedHerder.run(DistributedHerder.java:363) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:829) Caused by: org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor is below 1 or larger than the number of available brokers. {noformat} Due to this error the Connect node stops and has to be manually restarted (and of course it fails the test scenarios as well). h2. Reproduction In my test scenario I had: - 1 broker - 1 distributed Connect node - a patch applied on the broker to make sure we don't have metadata Steps to reproduce: # start up a ZooKeeper-based broker without the patch # put a breakpoint here: https://github.com/apache/kafka/blob/1d8b07ed6435568d3daf514c2d902107436d2ac8/clients/src/main/java/org/apache/kafka/clients/consumer/internals/TopicMetadataFetcher.java#L94 # start up a distributed Connect node # restart the Kafka broker with the patch to make sure there is no metadata # once the broker is started, release the debugger in Connect It should run into the error cited above and shut down. This is not desirable; the Connect cluster should retry to ensure continuous operation, or the broker should handle this case differently, for instance by returning a RetriableException. The earliest version I've tried this on is 2.8, but I think this affects versions before that as well (and after). -- This message was sent by Atlassian Jira (v8.20.10#820010)
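The client-side remedy suggested above (treating the error as retriable during startup) can be sketched like this. The names and the retry policy are illustrative only, not Connect's actual API:

```python
import time

class InvalidReplicationFactorError(Exception):
    """Stand-in for the broker error seen during startup."""

def fetch_with_retry(fetch, retries=5, initial_backoff_s=0.01):
    # Hypothetical sketch: retry with exponential backoff instead of
    # letting the herder thread die on the first metadata failure.
    delay = initial_backoff_s
    for attempt in range(retries):
        try:
            return fetch()
        except InvalidReplicationFactorError:
            if attempt == retries - 1:
                raise  # give up only after exhausting the retries
            time.sleep(delay)
            delay *= 2

# Simulate a broker that has no metadata for the first two requests.
attempts = {"n": 0}
def flaky_fetch():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise InvalidReplicationFactorError()
    return ["connect-offsets"]

topics = fetch_with_retry(flaky_fetch)
```

A bounded retry budget keeps genuinely broken clusters from hanging startup forever, while riding out the brief window where the broker has not yet loaded its metadata.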
[jira] [Created] (KAFKA-15219) Support delegation tokens in KRaft
Viktor Somogyi-Vass created KAFKA-15219: --- Summary: Support delegation tokens in KRaft Key: KAFKA-15219 URL: https://issues.apache.org/jira/browse/KAFKA-15219 Project: Kafka Issue Type: Improvement Affects Versions: 3.6.0 Reporter: Viktor Somogyi-Vass Assignee: Viktor Somogyi-Vass Delegation tokens were introduced in KIP-48 and improved in KIP-373. KIP-900 paved the way for supporting them in KRaft by adding SCRAM support, but delegation tokens still don't work in KRaft. There are multiple issues: - TokenManager would still try to create tokens in ZooKeeper. Instead, we should forward admin requests to the controller, which would store them in the metadata similarly to SCRAM. - TokenManager should run on controller nodes only (or in mixed mode). - Integration tests will need to be adapted as well and parameterized with ZooKeeper/KRaft. - Documentation needs to be improved to cover KRaft. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-12384) Flaky Test ListOffsetsRequestTest.testResponseIncludesLeaderEpoch
[ https://issues.apache.org/jira/browse/KAFKA-12384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viktor Somogyi-Vass resolved KAFKA-12384. - Fix Version/s: 3.6.0 Resolution: Fixed > Flaky Test ListOffsetsRequestTest.testResponseIncludesLeaderEpoch > - > > Key: KAFKA-12384 > URL: https://issues.apache.org/jira/browse/KAFKA-12384 > Project: Kafka > Issue Type: Test > Components: core, unit tests >Reporter: Matthias J. Sax >Assignee: Chia-Ping Tsai >Priority: Critical > Labels: flaky-test > Fix For: 3.6.0, 3.0.0 > > > {quote}org.opentest4j.AssertionFailedError: expected: <(0,0)> but was: > <(-1,-1)> at > org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) at > org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62) at > org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:182) at > org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:177) at > org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1124) at > kafka.server.ListOffsetsRequestTest.testResponseIncludesLeaderEpoch(ListOffsetsRequestTest.scala:172){quote} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-15059) Exactly-once source tasks fail to start during pending rebalances
[ https://issues.apache.org/jira/browse/KAFKA-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viktor Somogyi-Vass resolved KAFKA-15059. - Resolution: Fixed [~ChrisEgerton] since the PR is merged, I resolve this ticket. > Exactly-once source tasks fail to start during pending rebalances > - > > Key: KAFKA-15059 > URL: https://issues.apache.org/jira/browse/KAFKA-15059 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect, mirrormaker >Affects Versions: 3.6.0 >Reporter: Chris Egerton >Assignee: Chris Egerton >Priority: Blocker > Fix For: 3.6.0 > > > When asked to perform a round of zombie fencing, the distributed herder will > [reject the > request|https://github.com/apache/kafka/blob/17fd30e6b457f097f6a524b516eca1a6a74a9144/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/distributed/DistributedHerder.java#L1249-L1250] > if a rebalance is pending, which can happen if (among other things) a config > for a new connector or a new set of task configs has been recently read from > the config topic. > Normally this can be alleviated with a simple task restart, which isn't great > but isn't terrible. > However, when running MirrorMaker 2 in dedicated mode, there is no API to > restart failed tasks, and it can be more common to see this kind of failure > on a fresh cluster because three connector configurations are written in > rapid succession to the config topic. > > In order to provide a better experience for users of both vanilla Kafka > Connect and dedicated MirrorMaker 2 clusters, we can retry (likely with the > same exponential backoff introduced with KAFKA-14732) zombie fencing attempts > that fail due to a pending rebalance. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-15992) Make MM2 heartbeats topic name configurable
Viktor Somogyi-Vass created KAFKA-15992: --- Summary: Make MM2 heartbeats topic name configurable Key: KAFKA-15992 URL: https://issues.apache.org/jira/browse/KAFKA-15992 Project: Kafka Issue Type: Improvement Affects Versions: 3.7.0 Reporter: Viktor Somogyi-Vass Assignee: Viktor Somogyi-Vass With DefaultReplicationPolicy, the heartbeats topic name is hard-coded. Instead, this should be configurable, so users can avoid collisions with the "heartbeats" topics of other systems. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-16596) Flaky test – org.apache.kafka.clients.ClientUtilsTest.testParseAndValidateAddressesWithReverseLookup()
[ https://issues.apache.org/jira/browse/KAFKA-16596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viktor Somogyi-Vass resolved KAFKA-16596. - Fix Version/s: 3.8.0 Assignee: Andras Katona Resolution: Fixed > Flaky test – > org.apache.kafka.clients.ClientUtilsTest.testParseAndValidateAddressesWithReverseLookup() > > --- > > Key: KAFKA-16596 > URL: https://issues.apache.org/jira/browse/KAFKA-16596 > Project: Kafka > Issue Type: Test >Reporter: Igor Soarez >Assignee: Andras Katona >Priority: Major > Labels: GoodForNewContributors, good-first-issue > Fix For: 3.8.0 > > > org.apache.kafka.clients.ClientUtilsTest.testParseAndValidateAddressesWithReverseLookup() > failed in the following way: > > {code:java} > org.opentest4j.AssertionFailedError: Unexpected addresses [93.184.215.14, > 2606:2800:21f:cb07:6820:80da:af6b:8b2c] ==> expected: but was: > at > app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151) >at > app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132) >at app//org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63) > at app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36) > at app//org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:214) > at > app//org.apache.kafka.clients.ClientUtilsTest.testParseAndValidateAddressesWithReverseLookup(ClientUtilsTest.java:65) > {code} > As a result of the following assertions: > > {code:java} > // With lookup of example.com, either one or two addresses are expected > depending on > // whether ipv4 and ipv6 are enabled > List validatedAddresses = > checkWithLookup(asList("example.com:1")); > assertTrue(validatedAddresses.size() >= 1, "Unexpected addresses " + > validatedAddresses); > List validatedHostNames = > validatedAddresses.stream().map(InetSocketAddress::getHostName) > .collect(Collectors.toList()); > List expectedHostNames = asList("93.184.216.34", > 
"2606:2800:220:1:248:1893:25c8:1946"); {code} > It seems that the DNS result has changed for example.com. > -- This message was sent by Atlassian Jira (v8.20.10#820010)