[jira] [Commented] (KAFKA-7737) Consolidate InitProducerId API
[ https://issues.apache.org/jira/browse/KAFKA-7737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008780#comment-17008780 ] Viktor Somogyi-Vass commented on KAFKA-7737: [~hachikuji] sure, and I'll be happy to help in the reviews once your PR is ready for it! > Consolidate InitProducerId API > -- > > Key: KAFKA-7737 > URL: https://issues.apache.org/jira/browse/KAFKA-7737 > Project: Kafka > Issue Type: Task > Components: producer >Reporter: Viktor Somogyi-Vass >Assignee: Viktor Somogyi-Vass >Priority: Minor > Labels: exactly-once > > We have two separate paths in the producer for the InitProducerId API: one > for the transactional producer and one for the idempotent producer. It would > be nice to find a way to consolidate these. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KAFKA-7737) Consolidate InitProducerId API
[ https://issues.apache.org/jira/browse/KAFKA-7737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viktor Somogyi-Vass reassigned KAFKA-7737: -- Assignee: Jason Gustafson (was: Viktor Somogyi-Vass) > Consolidate InitProducerId API > -- > > Key: KAFKA-7737 > URL: https://issues.apache.org/jira/browse/KAFKA-7737 > Project: Kafka > Issue Type: Task > Components: producer >Reporter: Viktor Somogyi-Vass >Assignee: Jason Gustafson >Priority: Minor > Labels: exactly-once > > We have two separate paths in the producer for the InitProducerId API: one > for the transactional producer and one for the idempotent producer. It would > be nice to find a way to consolidate these. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-9366) please consider upgrade log4j to log4j2 due to critical security problem CVE-2019-17571
[ https://issues.apache.org/jira/browse/KAFKA-9366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008938#comment-17008938 ] ASF GitHub Bot commented on KAFKA-9366: --- dongjinleekr commented on pull request #7898: KAFKA-9366: please consider upgrade log4j to log4j2 due to critical security problem CVE-2019-17571 URL: https://github.com/apache/kafka/pull/7898 This PR changes the log4j dependency to log4j2. Log4j development moved to log4j2 after the final 1.x release, 1.2.17 (May 2012), which is affected by this problem, so the only way to fix it is to move the dependency to log4j2. The catch is that the API for setting log levels dynamically differs between log4j and log4j2, so this PR also updates how `Log4jController` works. This PR also fixes a potential problem in `Log4jController#getLogLevel`: if the root logger's level is null, it may result in a `NullPointerException`. ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > please consider upgrade log4j to log4j2 due to critical security problem > CVE-2019-17571 > --- > > Key: KAFKA-9366 > URL: https://issues.apache.org/jira/browse/KAFKA-9366 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 2.2.0, 2.1.1, 2.3.0, 2.4.0 >Reporter: leibo >Priority: Critical > > h2. CVE-2019-17571 Detail > Included in Log4j 1.2 is a SocketServer class that is vulnerable to > deserialization of untrusted data which can be exploited to remotely execute > arbitrary code when combined with a deserialization gadget when listening to > untrusted network traffic for log data. 
This affects Log4j versions 1.2 up to 1.2.17. > > [https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17571] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
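The level-lookup pitfall the PR mentions (a logger whose configured level is null) is easy to demonstrate without log4j on the classpath. The sketch below uses the JDK's `java.util.logging`, which has the same inheritance behavior: `getLevel()` returns null unless a level was set explicitly on that logger, so a controller must walk the parent chain rather than dereference the level directly. Class and method names here are illustrative, not Kafka's.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class NullSafeLevelLookup {
    // Walk up the parent chain until an explicitly-set level is found,
    // mirroring how an "effective level" is computed; never returns null.
    static String effectiveLevel(Logger logger) {
        for (Logger l = logger; l != null; l = l.getParent()) {
            Level level = l.getLevel();
            if (level != null) {
                return level.getName();
            }
        }
        return Level.INFO.getName(); // conventional default when nothing is set
    }

    public static void main(String[] args) {
        Logger child = Logger.getLogger("kafka.example.child");
        // No level was set on this logger, so getLevel() is null -- exactly
        // the case that would throw NPE if dereferenced directly.
        if (child.getLevel() != null) throw new AssertionError("expected null level");
        String effective = effectiveLevel(child);
        if (effective == null) throw new AssertionError("effective level must not be null");
        System.out.println("effective level: " + effective);
    }
}
```

The same walk-the-ancestors pattern applies whether the backend is log4j 1.x, log4j2, or `java.util.logging`; only the API names differ.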
[jira] [Assigned] (KAFKA-9366) please consider upgrade log4j to log4j2 due to critical security problem CVE-2019-17571
[ https://issues.apache.org/jira/browse/KAFKA-9366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjin Lee reassigned KAFKA-9366: -- Assignee: Dongjin Lee > please consider upgrade log4j to log4j2 due to critical security problem > CVE-2019-17571 > --- > > Key: KAFKA-9366 > URL: https://issues.apache.org/jira/browse/KAFKA-9366 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 2.2.0, 2.1.1, 2.3.0, 2.4.0 >Reporter: leibo >Assignee: Dongjin Lee >Priority: Critical > > h2. CVE-2019-17571 Detail > Included in Log4j 1.2 is a SocketServer class that is vulnerable to > deserialization of untrusted data which can be exploited to remotely execute > arbitrary code when combined with a deserialization gadget when listening to > untrusted network traffic for log data. This affects Log4j versions up to 1.2 > up to 1.2.17. > > [https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17571] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9367) CRC failure
Shivangi Singh created KAFKA-9367: - Summary: CRC failure Key: KAFKA-9367 URL: https://issues.apache.org/jira/browse/KAFKA-9367 Project: Kafka Issue Type: Bug Components: core Affects Versions: 2.0.0 Reporter: Shivangi Singh We have a 14-node Kafka (2.0.0) cluster. In our case: *Leader* : *Broker Id* : 1003 *Ip*: 10.84.198.238 *Replica* : *Broker Id* : 1014 *Ip*: 10.22.2.74 A request was sent from the replica to the leader, and the leader (10.84.198.238) logged the following exception /var/log/kafka/server.log.2019-12-26-00:[2019-12-26 00:13:04,386] ERROR Closing socket for 10.84.198.238:6667-10.22.2.74:53118-121025 because of error (kafka.network.Processor) /var/log/kafka/server.log.2019-12-26-00-org.apache.kafka.common.errors.InvalidRequestException: Error getting request for apiKey: FETCH, apiVersion: 8, connectionId: 10.84.198.238:6667-10.22.2.74:53118-121025, listenerName: ListenerName(PLAINTEXT), principal: User:ANONYMOUS /var/log/kafka/server.log.2019-12-26-00-Caused by: org.apache.kafka.common.protocol.types.SchemaException: *Error reading field 'forgotten_topics_data':* Error reading array of size 23668, only 69 bytes available /var/log/kafka/server.log.2019-12-26-00- at org.apache.kafka.common.protocol.types.Schema.read(Schema.java:77) /var/log/kafka/server.log.2019-12-26-00- at org.apache.kafka.common.protocol.ApiKeys.parseRequest(ApiKeys.java:290) /var/log/kafka/server.log.2019-12-26-00- at org.apache.kafka.common.requests.RequestContext.parseRequest(RequestContext.java:63) In response, the replica (10.22.2.74) logged the following [2019-12-26 00:13:04,390] WARN [ReplicaFetcher replicaId=1014, leaderId=1003, fetcherId=0] Error in response for fetch request (type=FetchRequest, replicaId=1014, maxWait=500, minBytes=1, maxBytes=10485760, fetchData={_topic_name_=(offset=50344687, logStartOffset=24957467, maxBytes=1048576)}, isolationLevel=READ_UNCOMMITTED, toForget=, metadata=(sessionId=1747349875, epoch=183382033)) (kafka.server.ReplicaFetcherThread) 
java.io.IOException: Connection to 1003 was disconnected before the response was read at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:97) at kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:96) at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:240) at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:43) at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:149) at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:114) at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82) After this, broker 1003 logged the following exception /var/log/kafka/server.log.2019-12-26-00:[2019-12-26 00:16:37,828] ERROR [ReplicaFetcher replicaId=1003, leaderId=1014, fetcherId=0] Found invalid messages during fetch for partition _topic_name_ offset 91200983 (kafka.server.ReplicaFetcherThread) /var/log/kafka/server.log.2019-12-26-00-*org.apache.kafka.common.record.InvalidRecordException: Record is corrupt (stored crc = 1460037823, computed crc = 114378201)* /var/log/kafka/server.log.2019-12-26-00:[2019-12-26 00:16:40,690] ERROR Closing socket for 10.84.198.238:6667-10.22.2.74:49850-740543 because of error (kafka.network.Processor) /var/log/kafka/server.log.2019-12-26-00-org.apache.kafka.common.errors.InvalidRequestException: Error getting request for apiKey: FETCH, apiVersion: 8, connectionId: 10.84.198.238:6667-10.22.2.74:49850-740543, listenerName: ListenerName(PLAINTEXT), principal: User:ANONYMOUS Could you help us with the above issue? -- This message was sent by Atlassian Jira (v8.3.4#803005)
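For context on the stored-vs-computed mismatch in the report above: Kafka's v2 record batches carry a CRC-32C checksum over the batch contents, and a broker or replica fetcher recomputes it over the bytes it received, rejecting the batch on mismatch. Below is a minimal sketch of that verification step using the JDK's `java.util.zip.CRC32C`; the record layout is deliberately simplified and is not Kafka's actual batch format.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32C;

public class CrcCheck {
    // Compute CRC-32C over the given bytes, as an unsigned 32-bit value.
    static long crc32c(byte[] data) {
        CRC32C crc = new CRC32C();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // A batch is valid only if the checksum stored alongside it matches
    // a fresh recomputation over the received bytes.
    static boolean isValid(byte[] payload, long storedCrc) {
        return crc32c(payload) == storedCrc;
    }

    public static void main(String[] args) {
        byte[] record = "record-value".getBytes(StandardCharsets.UTF_8);
        long stored = crc32c(record); // checksum written at produce time

        if (!isValid(record, stored)) throw new AssertionError("clean record must validate");

        record[0] ^= 0x01; // simulate a single bit flipped in flight or on disk
        if (isValid(record, stored)) throw new AssertionError("corruption must be detected");
        System.out.println("corruption detected: stored=" + stored + " computed=" + crc32c(record));
    }
}
```

A mismatch like the one logged here therefore indicates the bytes changed somewhere between the producer writing the checksum and the fetcher re-reading them (network, disk, or a framing bug), not which component changed them.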
[jira] [Commented] (KAFKA-9367) CRC failure
[ https://issues.apache.org/jira/browse/KAFKA-9367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009032#comment-17009032 ] Shivangi Singh commented on KAFKA-9367: --- We are also seeing the following exceptions [2020-01-06 21:42:12,375] ERROR [ReplicaFetcher replicaId=1009, leaderId=1006, fetcherId=0] Found invalid messages during fetch for partition _topic_name_ offset 909070901 (kafka.server.ReplicaFetcherThread) org.apache.kafka.common.errors.CorruptRecordException: Invalid magic found in record: 117 [2020-01-06 21:42:13,417] ERROR [ReplicaFetcher replicaId=1009, leaderId=1006, fetcherId=0] Found invalid messages during fetch for partition payments_payment_data_flow_instances-26 offset 909070901 (kafka.server.ReplicaFetcherThread) org.apache.kafka.common.record.InvalidRecordException: Record is corrupt (stored crc = 3006481486, computed crc = 2816728803) > CRC failure > > > Key: KAFKA-9367 > URL: https://issues.apache.org/jira/browse/KAFKA-9367 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 2.0.0 >Reporter: Shivangi Singh >Priority: Major > > We have a 14 node kafka(2.0.0) cluster > In our case > *Leader* : *Broker Id* : 1003 *Ip*: 10.84.198.238 > *Replica* : *Broker Id* : 1014 *Ip*: 10.22.2.74 > A request was sent from replica -> leader to which leader(10.84.198.238) had > the following exception > var/log/kafka/server.log.2019-12-26-00:[2019-12-26 00:13:04,386] ERROR > Closing socket for 10.84.198.238:6667-10.22.2.74:53118-121025 because of > error (kafka.network.Processor) > /var/log/kafka/server.log.2019-12-26-00-org.apache.kafka.common.errors.InvalidRequestException: > Error getting request for apiKey: FETCH, apiVersion: 8, connectionId: > 10.84.198.238:6667-10.22.2.74:53118-121025, listenerName: > ListenerName(PLAINTEXT), principal: User:ANONYMOUS > /var/log/kafka/server.log.2019-12-26-00-Caused by: > org.apache.kafka.common.protocol.types.SchemaException: *Error reading field > 
'forgotten_topics_data':* Error reading array of size 23668, only 69 bytes > available > /var/log/kafka/server.log.2019-12-26-00- at > org.apache.kafka.common.protocol.types.Schema.read(Schema.java:77) > /var/log/kafka/server.log.2019-12-26-00- at > org.apache.kafka.common.protocol.ApiKeys.parseRequest(ApiKeys.java:290) > /var/log/kafka/server.log.2019-12-26-00- at > org.apache.kafka.common.requests.RequestContext.parseRequest(RequestContext.java:63) > > In response to this, replica (10.22.2.74) had the following log in it > > [2019-12-26 00:13:04,390] WARN [ReplicaFetcher replicaId=1014, leaderId=1003, > fetcherId=0] Error in response for fetch request (type=FetchRequest, > replicaId=1014, maxWait=500, minBytes=1, maxBytes=10485760, > fetchData={_topic_name_=(offset=50344687, logStartOffset=24957467, > maxBytes=1048576)}, isolationLevel=READ_UNCOMMITTED, toForget=, > metadata=(sessionId=1747349875, epoch=183382033)) > (kafka.server.ReplicaFetcherThread) > java.io.IOException: Connection to 1003 was disconnected before the response > was read > at > org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:97) > at > kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:96) > at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:240) > at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:43) > at > kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:149) > at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:114) > at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82) > > Post this broker 1003 had the following exception > /var/log/kafka/server.log.2019-12-26-00:[2019-12-26 00:16:37,828] ERROR > [ReplicaFetcher replicaId=1003, leaderId=1014, fetcherId=0] Found invalid > messages during fetch for partition _topic_name_ offset 91200983 > (kafka.server.ReplicaFetcherThread) > 
/var/log/kafka/server.log.2019-12-26-00-*org.apache.kafka.common.record.InvalidRecordException: > Record is corrupt (stored crc = 1460037823, computed crc = 114378201)* > /var/log/kafka/server.log.2019-12-26-00:[2019-12-26 00:16:40,690] ERROR > Closing socket for 10.84.198.238:6667-10.22.2.74:49850-740543 because of > error (kafka.network.Processor) > /var/log/kafka/server.log.2019-12-26-00-org.apache.kafka.common.errors.InvalidRequestException: > Error getting request for apiKey: FETCH, apiVersion: 8, connectionId: > 10.84.198.238:6667-10.22.2.74:49850-740543, listenerName: > ListenerName(PLAINTEXT), principal: User:ANONYMOUS > Could you help us with the above issue? > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-9351) Higher count in destination cluster using Kafka MM2
[ https://issues.apache.org/jira/browse/KAFKA-9351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009035#comment-17009035 ] Ryanne Dolan commented on KAFKA-9351: - [~nitishgoyal13] are those event counts or offsets? MM2 cannot guarantee that downstream offsets or event counts match exactly -- it only offers at-least-once semantics. So you can be sure that records are not dropped, and they maintain approximately the same order, but there may be dupes due to retries in the producer. I'm working on a PoC for exactly-once semantics, but currently the Connect framework makes this difficult to get right. > Higher count in destination cluster using Kafka MM2 > --- > > Key: KAFKA-9351 > URL: https://issues.apache.org/jira/browse/KAFKA-9351 > Project: Kafka > Issue Type: Bug > Components: mirrormaker >Affects Versions: 2.4.0 >Reporter: Nitish Goyal >Priority: Blocker > > I have set up replication between clusters across different data centres. After > setting up replication, at times, I am observing a higher event count in the > destination cluster > Below are the counts in the source and destination clusters > > *Source Cluster* > ``` > > events_4:0:51048 > events_4:1:52250 > events_4:2:51526 > ``` > > *Destination Cluster* > ``` > nm5.events_4:0:53289 > nm5.events_4:1:54569 > nm5.events_4:2:53733 > ``` > > This is a blocker for us to start using the MM2 replicator -- This message was sent by Atlassian Jira (v8.3.4#803005)
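As the comment above explains, under at-least-once delivery the downstream counts can legitimately exceed the upstream counts with no data loss. If exact-count comparisons matter, one common workaround is consumer-side deduplication on a stable per-record identity such as the source partition and offset. Note this is a hedged sketch: MM2 does not attach such an identity for you, so it assumes the application propagates one (e.g. in a record header).

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MirrorDedup {
    // Identity of a record in the *source* cluster: partition + offset.
    // (An assumed convention, not something MM2 provides out of the box.)
    record SourceId(int partition, long offset) {}

    private final Set<SourceId> seen = new HashSet<>();

    // Returns true the first time a source id is observed, false for replays
    // caused by producer retries.
    boolean firstDelivery(SourceId id) {
        return seen.add(id);
    }

    public static void main(String[] args) {
        MirrorDedup dedup = new MirrorDedup();
        // Simulated deliveries, with (0, 42) replayed by a retry.
        List<SourceId> deliveries = List.of(
            new SourceId(0, 41), new SourceId(0, 42),
            new SourceId(0, 42), new SourceId(0, 43));
        long unique = deliveries.stream().filter(dedup::firstDelivery).count();
        if (unique != 3) throw new AssertionError("expected 3 unique records, got " + unique);
        System.out.println("unique records: " + unique);
    }
}
```

In production the seen-set would need bounding (e.g. tracking only the highest contiguous offset per source partition) rather than an unbounded `HashSet`.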
[jira] [Updated] (KAFKA-9351) Higher count in destination cluster using Kafka MM2
[ https://issues.apache.org/jira/browse/KAFKA-9351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryanne Dolan updated KAFKA-9351: Priority: Minor (was: Blocker) > Higher count in destination cluster using Kafka MM2 > --- > > Key: KAFKA-9351 > URL: https://issues.apache.org/jira/browse/KAFKA-9351 > Project: Kafka > Issue Type: Bug > Components: mirrormaker >Affects Versions: 2.4.0 >Reporter: Nitish Goyal >Priority: Minor > > I have set up replication between clusters across different data centres. After > setting up replication, at times, I am observing a higher event count in the > destination cluster > Below are the counts in the source and destination clusters > > *Source Cluster* > ``` > > events_4:0:51048 > events_4:1:52250 > events_4:2:51526 > ``` > > *Destination Cluster* > ``` > nm5.events_4:0:53289 > nm5.events_4:1:54569 > nm5.events_4:2:53733 > ``` > > This is a blocker for us to start using the MM2 replicator -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (KAFKA-9345) Deploy Mirror Maker 2.0 and dynamically modify the topic whitelist through REST API
[ https://issues.apache.org/jira/browse/KAFKA-9345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryanne Dolan closed KAFKA-9345. --- > Deploy Mirror Maker 2.0 and dynamically modify the topic whitelist through > REST API > --- > > Key: KAFKA-9345 > URL: https://issues.apache.org/jira/browse/KAFKA-9345 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect, mirrormaker >Affects Versions: 2.4.0 > Environment: runtime env: > source-cluster: kafka 2.2.1 > target-cluster: kafka 2.2.1 > Mirror Maker 2.0 : kafka 2.4.0 >Reporter: yzhou >Assignee: yzhou >Priority: Minor > > 1. Which is the best way to deploy mirror maker 2.0? (a dedicated mm2 > cluster or running mm2 in a connect cluster). Could you tell me the > difference between them? > 2. According to the blog or wiki > ([https://blog.cloudera.com/a-look-inside-kafka-mirrormaker-2/] , > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0#KIP-382:MirrorMaker2.0-Config,ACLSync] > , > [https://github.com/apache/kafka/blob/cae2a5e1f0779a0889f6cb43b523ebc8a812f4c2/connect/mirror/README.md] > ). Mirror Maker 2.0 supports dynamic modification of the topic whitelist, > but I cannot figure out how to make it work. Could you tell me how to solve this > problem? > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-9345) Deploy Mirror Maker 2.0 and dynamically modify the topic whitelist through REST API
[ https://issues.apache.org/jira/browse/KAFKA-9345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009040#comment-17009040 ] Ryanne Dolan commented on KAFKA-9345: - [~xinzhuxianshenger] no inconvenience. I'll close this ticket, as I don't believe it represents a bug. > Deploy Mirror Maker 2.0 and dynamically modify the topic whitelist through > REST API > --- > > Key: KAFKA-9345 > URL: https://issues.apache.org/jira/browse/KAFKA-9345 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect, mirrormaker >Affects Versions: 2.4.0 > Environment: runtime env: > source-cluster: kafka 2.2.1 > target-cluster: kafka 2.2.1 > Mirror Maker 2.0 : kafka 2.4.0 >Reporter: yzhou >Assignee: yzhou >Priority: Minor > > 1. Which is the best way to deploy mirror maker 2.0? (a dedicated mm2 > cluster or running mm2 in a connect cluster). Could you tell me the > difference between them? > 2. According to the blog or wiki > ([https://blog.cloudera.com/a-look-inside-kafka-mirrormaker-2/] , > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0#KIP-382:MirrorMaker2.0-Config,ACLSync] > , > [https://github.com/apache/kafka/blob/cae2a5e1f0779a0889f6cb43b523ebc8a812f4c2/connect/mirror/README.md] > ). Mirror Maker 2.0 supports dynamic modification of the topic whitelist, > but I cannot figure out how to make it work. Could you tell me how to solve this > problem? > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KAFKA-9345) Deploy Mirror Maker 2.0 and dynamically modify the topic whitelist through REST API
[ https://issues.apache.org/jira/browse/KAFKA-9345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryanne Dolan resolved KAFKA-9345. - Resolution: Information Provided > Deploy Mirror Maker 2.0 and dynamically modify the topic whitelist through > REST API > --- > > Key: KAFKA-9345 > URL: https://issues.apache.org/jira/browse/KAFKA-9345 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect, mirrormaker >Affects Versions: 2.4.0 > Environment: runtime env: > source-cluster: kafka 2.2.1 > target-cluster: kafka 2.2.1 > Mirror Maker 2.0 : kafka 2.4.0 >Reporter: yzhou >Assignee: yzhou >Priority: Minor > > 1. Which is the best way to deploy mirror maker 2.0? (a dedicated mm2 > cluster or running mm2 in a connect cluster). Could you tell me the > difference between them? > 2. According to the blog or wiki > ([https://blog.cloudera.com/a-look-inside-kafka-mirrormaker-2/] , > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0#KIP-382:MirrorMaker2.0-Config,ACLSync] > , > [https://github.com/apache/kafka/blob/cae2a5e1f0779a0889f6cb43b523ebc8a812f4c2/connect/mirror/README.md] > ). Mirror Maker 2.0 supports dynamic modification of the topic whitelist, > but I cannot figure out how to make it work. Could you tell me how to solve this > problem? > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-9364) Fix misleading consumer logs on throttling
[ https://issues.apache.org/jira/browse/KAFKA-9364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009086#comment-17009086 ] ASF GitHub Bot commented on KAFKA-9364: --- cmccabe commented on pull request #7894: KAFKA-9364: Fix misleading consumer logs on throttling URL: https://github.com/apache/kafka/pull/7894 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Fix misleading consumer logs on throttling > -- > > Key: KAFKA-9364 > URL: https://issues.apache.org/jira/browse/KAFKA-9364 > Project: Kafka > Issue Type: Bug >Reporter: Colin McCabe >Assignee: Colin McCabe >Priority: Minor > > Fix misleading consumer logs on throttling -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-9324) Drop support for Scala 2.11 (KIP-531)
[ https://issues.apache.org/jira/browse/KAFKA-9324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009094#comment-17009094 ] ASF GitHub Bot commented on KAFKA-9324: --- ijuma commented on pull request #7859: KAFKA-9324: Drop support for Scala 2.11 (KIP-531) URL: https://github.com/apache/kafka/pull/7859 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Drop support for Scala 2.11 (KIP-531) > - > > Key: KAFKA-9324 > URL: https://issues.apache.org/jira/browse/KAFKA-9324 > Project: Kafka > Issue Type: Improvement >Reporter: Ismael Juma >Assignee: Ismael Juma >Priority: Major > Labels: kip > Fix For: 2.5.0 > > > See > https://cwiki.apache.org/confluence/display/KAFKA/KIP-531%3A+Drop+support+for+Scala+2.11+in+Kafka+2.5 > for details. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9368) Preserve stream-time across rebalances/restarts
Sophie Blee-Goldman created KAFKA-9368: -- Summary: Preserve stream-time across rebalances/restarts Key: KAFKA-9368 URL: https://issues.apache.org/jira/browse/KAFKA-9368 Project: Kafka Issue Type: Bug Components: streams Reporter: Sophie Blee-Goldman Stream-time is used to make decisions about processing out-of-order records or dropping them if they are late (ie, timestamp < stream-time - grace-period). This is currently tracked on a per-processor basis such that each node has its own local view of stream-time based on the maximum timestamp it has processed. During rebalances and restarts, stream-time is initialized as UNKNOWN (ie, -1) for all processors in tasks that are newly created (or migrated). In effect, we forget the current stream-time in this case, which may lead to non-deterministic behavior if we stop processing right before a late record that would be dropped if we continued processing, but is not dropped after a rebalance/restart. Let's look at an example with a grace period of 5ms, a tumbling window of 5ms, and the following records (timestamps in parentheses): {code:java} r1(0) r2(5) r3(11) r4(2){code} In the example, stream-time advances as 0, 5, 11, 11 and thus record `r4` is dropped as late because 2 < 6 = 11 - 5. However, if we stop processing or rebalance after processing `r3` but before processing `r4`, we would reinitialize stream-time as -1, and thus would process `r4` on restart/after rebalance. The problem is that stream-time then advances differently from a global point of view: 0, 5, 11, 2. Of course, this is a corner case, because if we stopped processing one record earlier -- ie, after processing `r2` but before processing `r3` -- stream-time would have advanced correctly from a global point of view. Note that in previous versions the maximum partition-time was actually used for stream-time. This changed in 2.3 due to KAFKA-7895/[PR 6278|https://github.com/apache/kafka/pull/6278], and could change again in future versions (cf. KAFKA-8769). Partition-time itself is preserved as of 2.4 thanks to KAFKA-7994. -- This message was sent by Atlassian Jira (v8.3.4#803005)
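The non-determinism described in this ticket can be reproduced with a toy model of the lateness check: stream-time tracks the maximum timestamp processed so far, and a record is late when its timestamp is older than stream-time minus the grace period. The sketch below replays the ticket's example and shows that forgetting stream-time mid-stream (as after a rebalance) changes whether `r4` is dropped. This is a simplified model, not Kafka Streams' actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class StreamTimeDemo {
    static final long GRACE_MS = 5;

    // Process timestamps in order; reset stream-time to -1 (UNKNOWN) before
    // index `resetBefore` to model a rebalance/restart (pass -1 for no reset).
    // Returns the timestamps of dropped (late) records.
    static List<Long> droppedRecords(long[] timestamps, int resetBefore) {
        long streamTime = -1; // UNKNOWN
        List<Long> dropped = new ArrayList<>();
        for (int i = 0; i < timestamps.length; i++) {
            if (i == resetBefore) streamTime = -1; // state forgotten on rebalance
            long ts = timestamps[i];
            if (ts < streamTime - GRACE_MS) {
                dropped.add(ts); // late: outside the grace period
            } else {
                streamTime = Math.max(streamTime, ts); // advance stream-time
            }
        }
        return dropped;
    }

    public static void main(String[] args) {
        long[] ts = {0, 5, 11, 2}; // r1..r4 from the ticket
        // Uninterrupted: stream-time advances 0, 5, 11, 11 and r4 is late (2 < 11 - 5).
        if (!droppedRecords(ts, -1).equals(List.of(2L))) throw new AssertionError();
        // Rebalance between r3 and r4: stream-time restarts at -1, so r4 is processed.
        if (!droppedRecords(ts, 3).isEmpty()) throw new AssertionError();
        System.out.println("ok: r4 dropped only without the rebalance");
    }
}
```

The two assertions in `main` are exactly the two outcomes the ticket contrasts: the same input yields different late-record decisions depending on whether stream-time survived the rebalance.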
[jira] [Assigned] (KAFKA-9335) java.lang.IllegalArgumentException: Number of partitions must be at least 1.
[ https://issues.apache.org/jira/browse/KAFKA-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apurva Mehta reassigned KAFKA-9335: --- Assignee: Boyang Chen > java.lang.IllegalArgumentException: Number of partitions must be at least 1. > > > Key: KAFKA-9335 > URL: https://issues.apache.org/jira/browse/KAFKA-9335 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.4.0 >Reporter: Nitay Kufert >Assignee: Boyang Chen >Priority: Major > Labels: bug > > Hey, > When trying to upgrade our Kafka streams client to 2.4.0 (from 2.3.1) we > encountered the following exception: > {code:java} > java.lang.IllegalArgumentException: Number of partitions must be at least 1. > {code} > It's important to notice that the exact same code works just fine at 2.3.1. > > I have created a "toy" example which reproduces this exception: > [https://gist.github.com/nitayk/50da33b7bcce19ad0a7f8244d309cb8f] > and I would love to get some insight regarding why its happening / ways to > get around it > > Thanks -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-7994) Improve Stream-Time for rebalances and restarts
[ https://issues.apache.org/jira/browse/KAFKA-7994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sophie Blee-Goldman updated KAFKA-7994: --- Description: We compute a per-partition partition-time as the maximum timestamp over all records processed so far. Before 2.3 this was used to determine the logical stream-time used to make decisions about processing out-of-order records or dropping them if they are late (ie, timestamp < stream-time - grace-period). Preserving the stream-time is necessary to ensure deterministic results (see KAFKA-9368), and although the processor-time is now used instead of partition-time, preserving the partition-time is a first step towards improving the overall stream-time semantics. The partition-time is also used by the TimestampExtractor. It gets passed in to #extract and can be used to determine a rough timestamp estimate if the actual timestamp is missing, corrupt, etc. This means in the corner case where the next record to be processed after a rebalance/restart cannot have its actual timestamp determined, we have no way of coming up with a reasonable guess and the record will likely have to be dropped. A potential fix would be to store the latest observed partition-time in the metadata of committed offsets. This way, on restart/rebalance we can re-initialize partition-time correctly. was: We compute a per-partition partition-time as the maximum timestamp over all records processed so far. Furthermore, we use partition-time to compute stream-time for each task as the maximum over all partition-times (for all corresponding task partitions). This stream-time is used to make decisions about processing out-of-order records or dropping them if they are late (ie, timestamp < stream-time - grace-period). During rebalances and restarts, stream-time is initialized as UNKNOWN (ie, -1) for tasks that are newly created (or migrated). 
In effect, we forget the current stream-time in this case, which may lead to non-deterministic behavior if we stop processing right before a late record that would be dropped if we continued processing, but is not dropped after a rebalance/restart. Let's look at an example with a grace period of 5ms, a tumbling window of 5ms, and the following records (timestamps in parentheses): {code:java} r1(0) r2(5) r3(11) r4(2){code} In the example, stream-time advances as 0, 5, 11, 11 and thus record `r4` is dropped as late because 2 < 6 = 11 - 5. However, if we stop processing or rebalance after processing `r3` but before processing `r4`, we would reinitialize stream-time as -1, and thus would process `r4` on restart/after rebalance. The problem is that stream-time then advances differently from a global point of view: 0, 5, 11, 2. Note, this is a corner case, because if we stopped processing one record earlier, ie, after processing `r2` but before processing `r3`, stream-time would have advanced correctly from a global point of view. A potential fix would be to store the latest observed partition-time in the metadata of committed offsets. This way, on restart/rebalance we can re-initialize time correctly. > Improve Stream-Time for rebalances and restarts > --- > > Key: KAFKA-7994 > URL: https://issues.apache.org/jira/browse/KAFKA-7994 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: Matthias J. Sax >Assignee: Richard Yu >Priority: Major > Fix For: 2.4.0 > > Attachments: possible-patch.diff > > > We compute a per-partition partition-time as the maximum timestamp over all > records processed so far. Before 2.3 this was used to determine the logical > stream-time used to make decisions about processing out-of-order records or > dropping them if they are late (ie, timestamp < stream-time - grace-period). 
> Preserving the stream-time is necessary to ensure deterministic results (see > KAFKA-9368), and although the processor-time is now used instead of > partition-time, preserving the partition-time is a first step towards > improving the overall stream-time semantics. > The partition-time is also used by the TimestampExtractor. It gets passed in > to #extract and can be used to determine a rough timestamp estimate if the > actual timestamp is missing, corrupt, etc. This means in the corner case > where the next record to be processed after a rebalance/restart cannot have > its actual timestamp determined, we have no way of coming up with a > reasonable guess and the record will likely have to be dropped. > > A potential fix would be to store the latest observed partition-time in the > metadata of committed offsets. This way, on restart/rebalance we can > re-initialize partition-time correctly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
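The fix proposed in the description piggybacks the latest observed partition-time on the offset commit, using the free-form metadata string that accompanies committed offsets. Below is a minimal sketch of the encode/decode half of that idea; the string format is an assumption for illustration, not the format Kafka actually uses on the wire.

```java
public class PartitionTimeMetadata {
    static final long UNKNOWN = -1L;

    // Encode the observed partition-time into the offset-commit metadata string.
    static String encode(long partitionTime) {
        return Long.toString(partitionTime);
    }

    // Decode it back on restart/rebalance; tolerate absent or legacy metadata
    // by falling back to UNKNOWN instead of throwing.
    static long decode(String metadata) {
        if (metadata == null || metadata.isEmpty()) return UNKNOWN;
        try {
            return Long.parseLong(metadata);
        } catch (NumberFormatException e) {
            return UNKNOWN; // metadata written by an older version
        }
    }

    public static void main(String[] args) {
        if (decode(encode(1_546_300_800_000L)) != 1_546_300_800_000L)
            throw new AssertionError("round trip must preserve partition-time");
        if (decode(null) != UNKNOWN || decode("legacy") != UNKNOWN)
            throw new AssertionError("unparseable metadata must fall back to UNKNOWN");
        System.out.println("partition-time metadata round trip ok");
    }
}
```

The fallback path matters: committed offsets written by an earlier version carry no partition-time, so restart must degrade to UNKNOWN (-1) rather than fail, which is the same initialization the old behavior used.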
[jira] [Updated] (KAFKA-7994) Improve Partition-Time for rebalances and restarts
[ https://issues.apache.org/jira/browse/KAFKA-7994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sophie Blee-Goldman updated KAFKA-7994: --- Summary: Improve Partition-Time for rebalances and restarts (was: Improve Stream-Time for rebalances and restarts) > Improve Partition-Time for rebalances and restarts > -- > > Key: KAFKA-7994 > URL: https://issues.apache.org/jira/browse/KAFKA-7994 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: Matthias J. Sax >Assignee: Richard Yu >Priority: Major > Fix For: 2.4.0 > > Attachments: possible-patch.diff > > > We compute a per-partition partition-time as the maximum timestamp over all > records processed so far. Before 2.3 this was used to determine the logical > stream-time used to make decisions about processing out-of-order records or > dropping them if they are late (ie, timestamp < stream-time - grace-period). > Preserving the stream-time is necessary to ensure deterministic results (see > KAFKA-9368), and although the processor-time is now used instead of > partition-time, preserving the partition-time is a first step towards > improving the overall stream-time semantics. > The partition-time is also used by the TimestampExtractor. It gets passed in > to #extract and can be used to determine a rough timestamp estimate if the > actual timestamp is missing, corrupt, etc. This means in the corner case > where the next record to be processed after a rebalance/restart cannot have > its actual timestamp determined, we have no way of coming up with a > reasonable guess and the record will likely have to be dropped. > > A potential fix would be to store the latest observed partition-time in the > metadata of committed offsets. This way, on restart/rebalance we can > re-initialize partition-time correctly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-9368) Preserve stream-time across rebalances/restarts
[ https://issues.apache.org/jira/browse/KAFKA-9368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009121#comment-17009121 ] Sophie Blee-Goldman commented on KAFKA-9368: In KAFKA-7994 we fixed the partition-time for rebalances/restarts, but since stream-time is now determined on a per-processor basis and not by the partition-time, we should track the issue separately. Thus I split the "preserve stream-time" aspect of KAFKA-7994 into a new ticket (see the original ticket for the full discussion). > Preserve stream-time across rebalances/restarts > --- > > Key: KAFKA-9368 > URL: https://issues.apache.org/jira/browse/KAFKA-9368 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: Sophie Blee-Goldman >Priority: Major > > Stream-time is used to make decisions about processing out-of-order records > or dropping them if they are late (ie, timestamp < stream-time - grace-period). > This is currently tracked on a per-processor basis such that each node has > its own local view of stream-time based on the maximum timestamp it has > processed. > During rebalances and restarts, stream-time is initialized as UNKNOWN (ie, > -1) for all processors in tasks that are newly created (or migrated). In > effect, we forget the current stream-time in this case, which may lead to > non-deterministic behavior: if we stop processing right before a late record, > that record would be dropped if we continued processing, but is not dropped > after the rebalance/restart. Let's look at an example with a grace period of > 5ms for a tumbling window of 5ms, and the following records (timestamps in > parenthesis): > {code:java} > r1(0) r2(5) r3(11) r4(2){code} > In the example, stream-time advances as 0, 5, 11, 11 and thus record `r4` is > dropped as late because 2 < 6 = 11 - 5. 
However, if we stop processing or > rebalance after processing `r3` but before processing `r4`, we would > reinitialize stream-time as -1, and thus would process `r4` on restart/after > the rebalance. The problem is that stream-time then advances differently from a > global point of view: 0, 5, 11, 2. > Of course, this is a corner case, because if we stopped processing one > record earlier -- ie, after processing `r2` but before processing `r3` -- > stream-time would be advanced correctly from a global point of view. > Note that in previous versions the maximum partition-time was actually used > for stream-time. This changed in 2.3 due to KAFKA-7895/[PR > 6278|https://github.com/apache/kafka/pull/6278], and could potentially change > yet again in future versions (c.f. KAFKA-8769). Partition-time actually is > preserved as of 2.4 thanks to KAFKA-7994. -- This message was sent by Atlassian Jira (v8.3.4#803005)
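The lateness rule in the ticket's example can be sketched as a small standalone simulation (illustrative only, not Kafka Streams code). With a grace period of 5ms, replaying r1(0) r2(5) r3(11) r4(2) from the start drops r4, while a processor whose stream-time was reset to UNKNOWN (-1) right before r4 would accept it:

```java
import java.util.List;

public class StreamTimeDemo {
    static final long GRACE_MS = 5;

    // A record is late once its timestamp falls behind stream-time by
    // more than the grace period (ie, timestamp < stream-time - grace).
    static boolean isLate(long recordTs, long streamTime) {
        return recordTs < streamTime - GRACE_MS;
    }

    // Replays timestamps from a given initial stream-time and returns
    // how many records were dropped as late.
    static int countDropped(long initialStreamTime, List<Long> timestamps) {
        long streamTime = initialStreamTime;
        int dropped = 0;
        for (long ts : timestamps) {
            if (isLate(ts, streamTime)) dropped++;
            streamTime = Math.max(streamTime, ts);
        }
        return dropped;
    }

    public static void main(String[] args) {
        List<Long> records = List.of(0L, 5L, 11L, 2L);
        // Continuous processing: stream-time reaches 11, so r4(2) is late.
        System.out.println(countDropped(-1L, records));     // prints 1
        // After a restart right before r4, stream-time is re-initialized
        // to -1, so r4 is no longer considered late: results differ.
        System.out.println(countDropped(-1L, List.of(2L))); // prints 0
    }
}
```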
[jira] [Commented] (KAFKA-7994) Improve Partition-Time for rebalances and restarts
[ https://issues.apache.org/jira/browse/KAFKA-7994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009128#comment-17009128 ] Sophie Blee-Goldman commented on KAFKA-7994: Since partition-time is no longer used to determine stream-time, we fixed part of this issue but not its full scope, so I've updated the description to cover only the preservation of partition-time (which was fixed for 2.4). The remaining work w.r.t. preserving stream-time was broken out into a new ticket so we can track it separately. See KAFKA-9368 > Improve Partition-Time for rebalances and restarts > -- > > Key: KAFKA-7994 > URL: https://issues.apache.org/jira/browse/KAFKA-7994 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: Matthias J. Sax >Assignee: Richard Yu >Priority: Major > Fix For: 2.4.0 > > Attachments: possible-patch.diff > > > We compute a per-partition partition-time as the maximum timestamp over all > records processed so far. Before 2.3 this was used to determine the logical > stream-time used to make decisions about processing out-of-order records or > dropping them if they are late (ie, timestamp < stream-time - grace-period). > Preserving the stream-time is necessary to ensure deterministic results (see > KAFKA-9368), and although the processor-time is now used instead of > partition-time, preserving the partition-time is a first step towards > improving the overall stream-time semantics. > The partition-time is also used by the TimestampExtractor. It gets passed in > to #extract and can be used to determine a rough timestamp estimate if the > actual timestamp is missing, corrupt, etc. This means in the corner case > where the next record to be processed after a rebalance/restart cannot have > its actual timestamp determined, we have no way of coming up with a > reasonable guess and the record will likely have to be dropped. 
> > A potential fix would be to store the latest observed partition-time in the > metadata of committed offsets. This way, on restart/rebalance we can > re-initialize partition-time correctly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
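A minimal sketch of that proposal, assuming a hypothetical encoding of the partition-time into the free-form metadata string that Kafka's OffsetAndMetadata already carries (the "pt:" prefix and helper names are invented for illustration, not what Kafka Streams actually adopted):

```java
public class PartitionTimeMetadata {
    private static final String PREFIX = "pt:";

    // Attach the last observed partition-time to the commit metadata,
    // e.g. committing with new OffsetAndMetadata(offset, encode(time)).
    static String encode(long partitionTime) {
        return PREFIX + partitionTime;
    }

    // On restart/rebalance, restore the partition-time from the committed
    // metadata; fall back to -1 (UNKNOWN) for old or foreign metadata.
    static long decode(String metadata) {
        if (metadata == null || !metadata.startsWith(PREFIX)) {
            return -1L;
        }
        try {
            return Long.parseLong(metadata.substring(PREFIX.length()));
        } catch (NumberFormatException e) {
            return -1L;
        }
    }
}
```

The fall-back to -1 matters: consumers upgraded in place will still read commits written without the new metadata, and must behave exactly as before.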
[jira] [Created] (KAFKA-9369) Allow Consumers and Producers to Connect with User-Agent String
David Mollitor created KAFKA-9369: - Summary: Allow Consumers and Producers to Connect with User-Agent String Key: KAFKA-9369 URL: https://issues.apache.org/jira/browse/KAFKA-9369 Project: Kafka Issue Type: Improvement Reporter: David Mollitor Given the ad hoc nature of consumers and producers in Kafka, it can be difficult to track where connections to brokers and partitions are coming from. Please allow consumers and producers to pass an optional _user-agent_ string during the connection process so that they can be quickly and accurately identified. For example, if I am performing an upgrade on my consumers, I want to be able to verify that no consumers running an older version of the consuming software still exist; and if an application is configured to consume from the wrong consumer group, it can be quickly identified and removed. [https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-9369) Allow Consumers and Producers to Connect with User-Agent String
[ https://issues.apache.org/jira/browse/KAFKA-9369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009172#comment-17009172 ] Andrew Otto commented on KAFKA-9369: You could use the client.id configuration property when you instantiate your client [https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-Requests] [https://groups.google.com/forum/#!topic/kafka-clients/O2POKeq1EUE] > Allow Consumers and Producers to Connect with User-Agent String > --- > > Key: KAFKA-9369 > URL: https://issues.apache.org/jira/browse/KAFKA-9369 > Project: Kafka > Issue Type: Improvement >Reporter: David Mollitor >Priority: Minor > > Given the adhoc nature of consumers and producers in Kafka, it can be > difficult to track where connections to brokers and partitions are coming > from. > > Please allow consumers and producers to pass an optional _user-agent_ string > during the connection process so that they can quickly and accurately be > identified. For example, if I am performing an upgrade on my consumers, I > want to be able to see that no consumers with an older version number of the > consuming software still exist or if I see an application that is configured > to consumer from the wrong consumer group, they can quickly be identified and > removed. > > [https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent] -- This message was sent by Atlassian Jira (v8.3.4#803005)
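Tagging a consumer the way the comment suggests is a single config entry; the values below are illustrative, not from the ticket:

```java
import java.util.Properties;

public class ConsumerTagging {
    static Properties consumerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092"); // placeholder address
        props.put("group.id", "billing-app");
        // client.id is surfaced in broker request logs, quotas, and
        // per-client metrics, so an "app-name/version" string works as
        // a rough user-agent for consumers today.
        props.put("client.id", "billing-app-consumer/1.4.2");
        return props;
    }
}
```

Operationally, filtering broker-side metrics or logs on the version suffix is then enough to spot stragglers on an old release.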
[jira] [Created] (KAFKA-9370) Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in progress
Vikas Singh created KAFKA-9370: -- Summary: Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in progress Key: KAFKA-9370 URL: https://issues.apache.org/jira/browse/KAFKA-9370 Project: Kafka Issue Type: Bug Reporter: Vikas Singh `KafkaApis::handleCreatePartitionsRequest` returns `INVALID_TOPIC_EXCEPTION` if the topic is being deleted. Change it to return `UNKNOWN_TOPIC_OR_PARTITION` instead. After the delete-topic API returns, the client should see the topic as deleted; the fact that deletion is processed in the background shouldn't have any impact. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KAFKA-9370) Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in progress
[ https://issues.apache.org/jira/browse/KAFKA-9370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Singh reassigned KAFKA-9370: -- Assignee: Vikas Singh > Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in progress > -- > > Key: KAFKA-9370 > URL: https://issues.apache.org/jira/browse/KAFKA-9370 > Project: Kafka > Issue Type: Bug >Reporter: Vikas Singh >Assignee: Vikas Singh >Priority: Major > > `KafkaApis::handleCreatePartitionsRequest` returns `INVALID_TOPIC_EXCEPTION` > if the topic is getting deleted. Change it to return > `UNKNOWN_TOPIC_OR_PARTITION` instead. After the delete topic api returns, > client should see the topic as deleted. The fact that we are processing > deletion in background shouldn't have any impact. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-9369) Allow Consumers and Producers to Connect with User-Agent String
[ https://issues.apache.org/jira/browse/KAFKA-9369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009192#comment-17009192 ] David Mollitor commented on KAFKA-9369: --- [~ottomata] Thanks for that. I am familiar with that configuration. However, there is no corollary for the producers. [https://jaceklaskowski.gitbooks.io/apache-kafka/kafka-properties-client-id.html] > Allow Consumers and Producers to Connect with User-Agent String > --- > > Key: KAFKA-9369 > URL: https://issues.apache.org/jira/browse/KAFKA-9369 > Project: Kafka > Issue Type: Improvement >Reporter: David Mollitor >Priority: Minor > > Given the adhoc nature of consumers and producers in Kafka, it can be > difficult to track where connections to brokers and partitions are coming > from. > > Please allow consumers and producers to pass an optional _user-agent_ string > during the connection process so that they can quickly and accurately be > identified. For example, if I am performing an upgrade on my consumers, I > want to be able to see that no consumers with an older version number of the > consuming software still exist or if I see an application that is configured > to consumer from the wrong consumer group, they can quickly be identified and > removed. > > [https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-9348) Broker shutdown during controller initialization can lead to zombie broker
[ https://issues.apache.org/jira/browse/KAFKA-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009229#comment-17009229 ] David Arthur commented on KAFKA-9348: - I talked with [~hachikuji] about this and after looking through the code, it seems like this shouldn't be able to happen since things are serialized through the event queue in the Controller. We'll leave this issue open in case we see this again and can capture some logs. > Broker shutdown during controller initialization can lead to zombie broker > -- > > Key: KAFKA-9348 > URL: https://issues.apache.org/jira/browse/KAFKA-9348 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Major > > It is possible that a broker may be shutdown while it is in the process of > becoming the controller. There is no protection currently to ensure that > initialization doesn't interfere with shutdown. An example of this is the > shutdown of the controller channel manager. It is possible that the request > send threads are restarted by the initialization logic _after_ the shutdown > method has returned. In this case, there will be no call to > `initiateShutdown` on any newly created send threads which will leave the > shutdown hook hanging. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-9335) java.lang.IllegalArgumentException: Number of partitions must be at least 1.
[ https://issues.apache.org/jira/browse/KAFKA-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boyang Chen updated KAFKA-9335: --- Priority: Blocker (was: Major) > java.lang.IllegalArgumentException: Number of partitions must be at least 1. > > > Key: KAFKA-9335 > URL: https://issues.apache.org/jira/browse/KAFKA-9335 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.4.0 >Reporter: Nitay Kufert >Assignee: Boyang Chen >Priority: Blocker > Labels: bug > > Hey, > When trying to upgrade our Kafka Streams client to 2.4.0 (from 2.3.1) we > encountered the following exception: > {code:java} > java.lang.IllegalArgumentException: Number of partitions must be at least 1. > {code} > It's important to notice that the exact same code works just fine on 2.3.1. > > I have created a "toy" example which reproduces this exception: > [https://gist.github.com/nitayk/50da33b7bcce19ad0a7f8244d309cb8f] > and I would love to get some insight regarding why it's happening / ways to > get around it > > Thanks -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-8764) LogCleanerManager endless loop while compacting/cleaning segments
[ https://issues.apache.org/jira/browse/KAFKA-8764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009262#comment-17009262 ] Jeff Nadler commented on KAFKA-8764: I wanted to throw my .02 in here: We are seeing the same issue, also with compact topics. The compact topics in question are quite low traffic. LogCleaner is running flat-out, with only a few ms between attempts to clean a single topic-partition. The log cleaner thread is consuming almost an entire core: {code:java} 14857 kafka 20 0 8894844 2.187g 12396 R 87.5 56.8 47:04.92 kafka-log-clean {code} We're running 2.4.0, openjdk11, happy to provide any add'l info to help with this issue. > LogCleanerManager endless loop while compacting/cleaning segments > - > > Key: KAFKA-8764 > URL: https://issues.apache.org/jira/browse/KAFKA-8764 > Project: Kafka > Issue Type: Bug > Components: log cleaner >Affects Versions: 2.3.0, 2.2.1 > Environment: docker base image: openjdk:8-jre-alpine base image, > kafka from http://ftp.carnet.hr/misc/apache/kafka/2.2.1/kafka_2.12-2.2.1.tgz >Reporter: Tomislav Rajakovic >Priority: Major > Attachments: log-cleaner-bug-reproduction.zip > > > {{LogCleanerManager stuck in endless loop while clearing segments for one > partition resulting with many log outputs and heavy disk read/writes/IOPS.}} > > Issue appeared on follower brokers, and it happens on every (new) broker if > partition assignment is changed. 
> > Original issue setup: > * kafka_2.12-2.2.1 deployed as statefulset on kubernetes, 5 brokers > * log directory is (AWS) EBS mounted PV, gp2 (ssd) kind of 750GiB > * 5 zookeepers > * topic created with config: > ** name = "backup_br_domain_squad" > partitions = 36 > replication_factor = 3 > config = { > "cleanup.policy" = "compact" > "min.compaction.lag.ms" = "8640" > "min.cleanable.dirty.ratio" = "0.3" > } > > > Log excerpt: > {{[2019-08-07 12:10:53,895] INFO [Log partition=backup_br_domain_squad-14, > dir=/var/lib/kafka/data/topics] Deleting segment 0 (kafka.log.Log)}} > {{[2019-08-07 12:10:53,895] INFO Deleted log > /var/lib/kafka/data/topics/backup_br_domain_squad-14/.log.deleted. > (kafka.log.LogSegment)}} > {{[2019-08-07 12:10:53,896] INFO Deleted offset index > /var/lib/kafka/data/topics/backup_br_domain_squad-14/.index.deleted. > (kafka.log.LogSegment)}} > {{[2019-08-07 12:10:53,896] INFO Deleted time index > /var/lib/kafka/data/topics/backup_br_domain_squad-14/.timeindex.deleted. > (kafka.log.LogSegment)}} > {{[2019-08-07 12:10:53,964] INFO [Log partition=backup_br_domain_squad-14, > dir=/var/lib/kafka/data/topics] Deleting segment 0 (kafka.log.Log)}} > {{[2019-08-07 12:10:53,964] INFO Deleted log > /var/lib/kafka/data/topics/backup_br_domain_squad-14/.log.deleted. > (kafka.log.LogSegment)}} > {{[2019-08-07 12:10:53,964] INFO Deleted offset index > /var/lib/kafka/data/topics/backup_br_domain_squad-14/.index.deleted. > (kafka.log.LogSegment)}} > {{[2019-08-07 12:10:53,964] INFO Deleted time index > /var/lib/kafka/data/topics/backup_br_domain_squad-14/.timeindex.deleted. > (kafka.log.LogSegment)}} > {{[2019-08-07 12:10:54,031] INFO [Log partition=backup_br_domain_squad-14, > dir=/var/lib/kafka/data/topics] Deleting segment 0 (kafka.log.Log)}} > {{[2019-08-07 12:10:54,032] INFO Deleted log > /var/lib/kafka/data/topics/backup_br_domain_squad-14/.log.deleted. 
> (kafka.log.LogSegment)}} > {{[2019-08-07 12:10:54,032] INFO Deleted offset index > /var/lib/kafka/data/topics/backup_br_domain_squad-14/.index.deleted. > (kafka.log.LogSegment)}} > {{[2019-08-07 12:10:54,032] INFO Deleted time index > /var/lib/kafka/data/topics/backup_br_domain_squad-14/.timeindex.deleted. > (kafka.log.LogSegment)}} > {{[2019-08-07 12:10:54,101] INFO [Log partition=backup_br_domain_squad-14, > dir=/var/lib/kafka/data/topics] Deleting segment 0 (kafka.log.Log)}} > {{[2019-08-07 12:10:54,101] INFO Deleted log > /var/lib/kafka/data/topics/backup_br_domain_squad-14/.log.deleted. > (kafka.log.LogSegment)}} > {{[2019-08-07 12:10:54,101] INFO Deleted offset index > /var/lib/kafka/data/topics/backup_br_domain_squad-14/.index.deleted. > (kafka.log.LogSegment)}} > {{[2019-08-07 12:10:54,101] INFO Deleted time index > /var/lib/kafka/data/topics/backup_br_domain_squad-14/.timeindex.deleted. > (kafka.log.LogSegment)}} > {{[2019-08-07 12:10:54,173] INFO [Log partition=backup_br_domain_squad-14, > dir=/var/lib/kafka/data/topics] Deleting segment 0 (kafka.log.Log)}} > {{[2019-08-07 12:10:54,173] I
[jira] [Created] (KAFKA-9371) Disk space is not released after Kafka clears data due to retention settings
Arda Savran created KAFKA-9371: -- Summary: Disk space is not released after Kafka clears data due to retention settings Key: KAFKA-9371 URL: https://issues.apache.org/jira/browse/KAFKA-9371 Project: Kafka Issue Type: Bug Components: core, streams Affects Versions: 2.2.0 Environment: CentOS 7.7 Reporter: Arda Savran We defined a retention time of 15 minutes on our topics. It looks like Kafka is deleting the messages as configured, but the disk space is not reclaimed: "df" output shows 30G for kafka-logs instead of the real size, which should be about 1GB. {{/usr/sbin/lsof | grep deleted }} output shows a bunch of files under kafka-logs that are deleted but still consuming space. Is this a known issue? Is there a setting that I can apply to the Kafka broker? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9372) Add producer config to make topicExpiry configurable
Jiao Zhang created KAFKA-9372: - Summary: Add producer config to make topicExpiry configurable Key: KAFKA-9372 URL: https://issues.apache.org/jira/browse/KAFKA-9372 Project: Kafka Issue Type: Improvement Components: producer Affects Versions: 1.1.0 Reporter: Jiao Zhang Sometimes we get the error "org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 1000 ms" on the producer side. We investigated and found that: 1. our producer produces messages at a really low rate, with intervals of more than 10 minutes; 2. by default, the producer expires topics after TOPIC_EXPIRY_MS, and once a topic has expired, if no data is produced before the next metadata update (automatically triggered by metadata.max.age.ms), the partitions entry for the topic disappears from the Metadata cache. As a result, almost every produce has to fetch metadata first, which can end in a timeout. To solve this, we propose adding a new producer config, metadata.topic.expiry, to make topicExpiry configurable. Topic expiry is only useful when the producer is long-lived and produces to a varying set of topics. In the case where producers are bound to a single topic or a few fixed topics, there is no need to expire topics at all. -- This message was sent by Atlassian Jira (v8.3.4#803005)
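The proposed metadata.topic.expiry setting does not exist in the producer versions discussed here, so any config shown now is only a hedged interim mitigation: give the blocking metadata fetch more headroom than the 1000 ms budget implied by the reported error. The values below are illustrative:

```java
import java.util.Properties;

public class ProducerMetadataTuning {
    static Properties producerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092"); // placeholder address
        // Upper bound on how long send() may block waiting for metadata;
        // "Failed to update metadata after 1000 ms" suggests this was set
        // very tight relative to a full metadata round trip.
        props.put("max.block.ms", "10000");
        // Periodic metadata refresh interval (default 300000 ms); topic
        // expiry itself is hard-coded in these client versions, which is
        // exactly what the ticket proposes to make configurable.
        props.put("metadata.max.age.ms", "300000");
        return props;
    }
}
```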
[jira] [Created] (KAFKA-9373) Improve shutdown performance via lazy accessing the offset and time indices.
Adem Efe Gencer created KAFKA-9373: -- Summary: Improve shutdown performance via lazy accessing the offset and time indices. Key: KAFKA-9373 URL: https://issues.apache.org/jira/browse/KAFKA-9373 Project: Kafka Issue Type: Bug Components: log Affects Versions: 2.3.1, 2.4.0, 2.3.0 Reporter: Adem Efe Gencer Assignee: Adem Efe Gencer Fix For: 2.3.1, 2.4.0, 2.3.0 KAFKA-7283 enabled lazy mmap on index files by initializing indices on demand rather than performing costly disk/memory operations when creating all indices on broker startup. This helped reduce the startup time of brokers. However, segment indices are still created when segments are closed, regardless of whether they were ever accessed. Ideally we should: * Improve shutdown performance via lazy accessing of the offset and time indices. * Eliminate redundant disk accesses and memory-mapped operations while deleting or renaming files that back segment indices. * Prevent illegal accesses to the underlying indices of a closed segment, which would lead to memory leaks due to recreation of the underlying memory-mapped objects. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-9373) Improve shutdown performance via lazy accessing the offset and time indices.
[ https://issues.apache.org/jira/browse/KAFKA-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009314#comment-17009314 ] ASF GitHub Bot commented on KAFKA-9373: --- efeg commented on pull request #7900: KAFKA-9373: Improve shutdown performance via lazy accessing the offset and time indices URL: https://github.com/apache/kafka/pull/7900 KAFKA-7283 enabled lazy mmap on index files by initializing indices on-demand rather than performing costly disk/memory operations when creating all indices on broker startup. This helped reducing the startup time of brokers. However, segment indices are still created on closing segments, regardless of whether they need to be closed or not. This patch: * Improves shutdown performance via lazy accessing the offset and time indices. * Eliminates redundant disk accesses and memory mapped operations while deleting or renaming files that back segment indices. * Prevents illegal accesses to underlying indices of a closed segment, which would lead to memory leaks due to recreation of the underlying memory mapped objects. In our evaluations in a cluster with 31 brokers, where each broker has 13K to 20K segments, we observed up to 2 orders of magnitude faster LogManager shutdown times with this patch -- i.e. dropping the LogManager shutdown time of each broker from 10s of seconds to 100s of milliseconds. ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Improve shutdown performance via lazy accessing the offset and time indices. 
> > > Key: KAFKA-9373 > URL: https://issues.apache.org/jira/browse/KAFKA-9373 > Project: Kafka > Issue Type: Bug > Components: log >Affects Versions: 2.3.0, 2.4.0, 2.3.1 >Reporter: Adem Efe Gencer >Assignee: Adem Efe Gencer >Priority: Major > Fix For: 2.3.0, 2.4.0, 2.3.1 > > > KAFKA-7283 enabled lazy mmap on index files by initializing indices on-demand > rather than performing costly disk/memory operations when creating all > indices on broker startup. This helped reducing the startup time of brokers. > However, segment indices are still created on closing segments, regardless of > whether they need to be closed or not. > > Ideally we should: > * Improve shutdown performance via lazy accessing the offset and time > indices. > * Eliminate redundant disk accesses and memory mapped operations while > deleting or renaming files that back segment indices. > * Prevent illegal accesses to underlying indices of a closed segment, which > would lead to memory leaks due to recreation of the underlying memory mapped > objects. -- This message was sent by Atlassian Jira (v8.3.4#803005)
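The lazy-access idea above can be illustrated with a generic holder (a sketch of the pattern only, not the actual Kafka index code): the expensive mmap-style work runs on first access, and close/delete can skip it entirely when the value was never materialized, which is where the shutdown-time savings come from:

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

public class LazyHandle<T> {
    private final Supplier<T> factory;
    private T value; // null until first get()

    LazyHandle(Supplier<T> factory) {
        this.factory = factory;
    }

    // First access pays the creation cost (for an index: mmap the file).
    synchronized T get() {
        if (value == null) {
            value = factory.get();
        }
        return value;
    }

    synchronized boolean materialized() {
        return value != null;
    }

    // On shutdown, only touch the resource if it was ever created; this
    // is what lets a broker skip untouched indices when closing segments.
    synchronized void closeIfMaterialized(Consumer<T> closer) {
        if (value != null) {
            closer.accept(value);
            value = null;
        }
    }
}
```

Guarding all accesses behind one handle also gives a single place to reject use after close, the leak scenario the third bullet warns about.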
[jira] [Commented] (KAFKA-9351) Higher count in destination cluster using Kafka MM2
[ https://issues.apache.org/jira/browse/KAFKA-9351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009365#comment-17009365 ] Nitish Goyal commented on KAFKA-9351: - [~ryannedolan] These are event counts. At times, I only see a small difference in event counts (which is well explained by retries in the producer). But a few times, I see a huge difference in event counts (which I am worried about). E.g.: a partition with a few million events would show a difference of a few hundred thousand records. Is this expected behavior? Can producer retries cause a difference of a few hundred thousand records in the destination cluster? > Higher count in destination cluster using Kafka MM2 > --- > > Key: KAFKA-9351 > URL: https://issues.apache.org/jira/browse/KAFKA-9351 > Project: Kafka > Issue Type: Bug > Components: mirrormaker >Affects Versions: 2.4.0 >Reporter: Nitish Goyal >Priority: Minor > > I have set up replication between clusters across different data centres. After > setting up replication, at times, I am observing a higher event count in the > destination cluster. > Below are the counts in the source and destination clusters > > *Source Cluster* > ``` > > events_4:0:51048 > events_4:1:52250 > events_4:2:51526 > ``` > > *Destination Cluster* > ``` > nm5.events_4:0:53289 > nm5.events_4:1:54569 > nm5.events_4:2:53733 > ``` > > This is a blocker for us to start using the MM2 replicator -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-9369) Allow Consumers and Producers to Connect with User-Agent String
[ https://issues.apache.org/jira/browse/KAFKA-9369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009387#comment-17009387 ] Nikolay Izhikov commented on KAFKA-9369: [~belugabehr] You can find documentation for the "client.id" for producers here - http://kafka.apache.org/documentation.html#producerconfigs https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java#L128 - option in source code. > Allow Consumers and Producers to Connect with User-Agent String > --- > > Key: KAFKA-9369 > URL: https://issues.apache.org/jira/browse/KAFKA-9369 > Project: Kafka > Issue Type: Improvement >Reporter: David Mollitor >Priority: Minor > > Given the adhoc nature of consumers and producers in Kafka, it can be > difficult to track where connections to brokers and partitions are coming > from. > > Please allow consumers and producers to pass an optional _user-agent_ string > during the connection process so that they can quickly and accurately be > identified. For example, if I am performing an upgrade on my consumers, I > want to be able to see that no consumers with an older version number of the > consuming software still exist or if I see an application that is configured > to consumer from the wrong consumer group, they can quickly be identified and > removed. > > [https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent] -- This message was sent by Atlassian Jira (v8.3.4#803005)
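As the comment notes, the producer accepts the same client.id setting (CLIENT_ID_CONFIG in ProducerConfig), so the tagging approach is available on that side too; values below are illustrative:

```java
import java.util.Properties;

public class ProducerTagging {
    static Properties producerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092"); // placeholder address
        // Same client.id mechanism as on the consumer: there is no
        // dedicated user-agent field, but this string reaches broker-side
        // request logs, quotas, and per-client metrics.
        props.put("client.id", "billing-app-producer/1.4.2");
        return props;
    }
}
```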
[jira] [Created] (KAFKA-9374) Worker can be disabled by blocked connectors
Chris Egerton created KAFKA-9374: Summary: Worker can be disabled by blocked connectors Key: KAFKA-9374 URL: https://issues.apache.org/jira/browse/KAFKA-9374 Project: Kafka Issue Type: Bug Components: KafkaConnect Affects Versions: 2.3.1, 2.4.0, 2.2.2, 2.2.1, 2.3.0, 2.1.1, 2.2.0, 2.1.0, 2.0.1, 2.0.0, 1.1.1, 1.1.0, 1.0.2, 1.0.1, 1.0.0 Reporter: Chris Egerton Assignee: Chris Egerton If a connector hangs during any of its {{initialize}}, {{start}}, {{stop}}, {{taskConfigs}}, {{taskClass}}, {{version}}, {{config}}, or {{validate}} methods, the worker will be disabled for some types of requests thereafter, including connector creation, connector reconfiguration, and connector deletion. This only occurs in distributed mode and is due to the threading model used by the [DistributedHerder|https://github.com/apache/kafka/blob/03f763df8a8d9482d8c099806336f00cf2521465/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/distributed/DistributedHerder.java] class. One potential solution could be to treat connectors that fail to start, stop, etc. in time similarly to tasks that fail to stop within the [task graceful shutdown timeout period|https://github.com/apache/kafka/blob/03f763df8a8d9482d8c099806336f00cf2521465/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerConfig.java#L121-L126] by handling all connector interactions on a separate thread, waiting for them to complete within a timeout, and abandoning the thread (and transitioning the connector to the {{FAILED}} state, if it has been created at all) if that timeout expires. -- This message was sent by Atlassian Jira (v8.3.4#803005)
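A hedged sketch of the mitigation the ticket describes: run each connector interaction on a separate thread, bound the wait, and let the caller transition the connector to FAILED when the timeout expires. The class name and helper are hypothetical, not Connect APIs:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedConnectorCall {
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "connector-op");
        t.setDaemon(true); // an abandoned, hung op must not block JVM exit
        return t;
    });

    /**
     * Runs a connector interaction (start/stop/taskConfigs/...) off the
     * herder thread. Returns true on success; false if the call threw or
     * overran the timeout, in which case the caller would mark the
     * connector FAILED and abandon the worker thread.
     */
    static boolean callWithTimeout(Runnable op, long timeoutMs)
            throws InterruptedException {
        Future<?> f = POOL.submit(op);
        try {
            f.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            f.cancel(true); // best effort; the thread may keep running
            return false;
        } catch (ExecutionException e) {
            return false;   // the connector method itself threw
        }
    }
}
```

The key property is that the herder thread always regains control within the timeout, so REST requests like connector creation and deletion keep working even while one connector is wedged.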