[jira] [Commented] (KAFKA-7737) Consolidate InitProducerId API

2020-01-06 Thread Viktor Somogyi-Vass (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008780#comment-17008780
 ] 

Viktor Somogyi-Vass commented on KAFKA-7737:


[~hachikuji] sure, and I'll be happy to help with the reviews once your PR is 
ready!

> Consolidate InitProducerId API
> --
>
> Key: KAFKA-7737
> URL: https://issues.apache.org/jira/browse/KAFKA-7737
> Project: Kafka
>  Issue Type: Task
>  Components: producer 
>Reporter: Viktor Somogyi-Vass
>Assignee: Viktor Somogyi-Vass
>Priority: Minor
>  Labels: exactly-once
>
> We have two separate paths in the producer for the InitProducerId API: one 
> for the transactional producer and one for the idempotent producer. It would 
> be nice to find a way to consolidate these.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-7737) Consolidate InitProducerId API

2020-01-06 Thread Viktor Somogyi-Vass (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viktor Somogyi-Vass reassigned KAFKA-7737:
--

Assignee: Jason Gustafson  (was: Viktor Somogyi-Vass)

> Consolidate InitProducerId API
> --
>
> Key: KAFKA-7737
> URL: https://issues.apache.org/jira/browse/KAFKA-7737
> Project: Kafka
>  Issue Type: Task
>  Components: producer 
>Reporter: Viktor Somogyi-Vass
>Assignee: Jason Gustafson
>Priority: Minor
>  Labels: exactly-once
>
> We have two separate paths in the producer for the InitProducerId API: one 
> for the transactional producer and one for the idempotent producer. It would 
> be nice to find a way to consolidate these.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9366) please consider upgrade log4j to log4j2 due to critical security problem CVE-2019-17571

2020-01-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008938#comment-17008938
 ] 

ASF GitHub Bot commented on KAFKA-9366:
---

dongjinleekr commented on pull request #7898: KAFKA-9366: please consider 
upgrade log4j to log4j2 due to critical security problem CVE-2019-17571
URL: https://github.com/apache/kafka/pull/7898
 
 
   This PR changes the log4j dependency to log4j2.
   
   log4j development moved to log4j2 after its final release, 1.2.17 (May 
2012), which is affected by this problem. So the only way to fix it is to 
move the log4j dependency to log4j2.
   
   The problem is that the API for setting log levels dynamically differs 
between log4j and log4j2, so this PR also updates how `Log4jController` works.
   
   This PR also fixes a potential problem in `Log4jController#getLogLevel`: if 
the root logger's level is null, it may result in a `NullPointerException`.
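   
   For illustration, a minimal sketch of the API difference (the log4j and 
log4j2 calls below are the libraries' public APIs; the surrounding class and 
the `getLogLevel` helper are illustrative, not Kafka's actual 
`Log4jController` code):
{code:java}
import org.apache.log4j.Level;
import org.apache.log4j.Logger;
import org.apache.logging.log4j.core.config.Configurator;

class LogLevelSketch {
    static void setLevels() {
        // log4j 1.x: set the level directly on the Logger object.
        Logger.getLogger("kafka.server").setLevel(Level.DEBUG);

        // log4j2: loggers belong to a LoggerContext, so levels are
        // changed through the Configurator helper instead.
        Configurator.setLevel("kafka.server",
            org.apache.logging.log4j.Level.DEBUG);
    }

    // Null-safe lookup: in log4j 1.x a logger's own level can be null
    // when it inherits its effective level from a parent.
    static String getLogLevel(Logger logger) {
        Level level = logger.getLevel();
        return level == null ? "" : level.toString(); // avoid the NPE
    }
}
{code}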
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> please consider upgrade log4j to log4j2 due to critical security problem 
> CVE-2019-17571
> ---
>
> Key: KAFKA-9366
> URL: https://issues.apache.org/jira/browse/KAFKA-9366
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.2.0, 2.1.1, 2.3.0, 2.4.0
>Reporter: leibo
>Priority: Critical
>
> h2. CVE-2019-17571 Detail
> Included in Log4j 1.2 is a SocketServer class that is vulnerable to 
> deserialization of untrusted data which can be exploited to remotely execute 
> arbitrary code when combined with a deserialization gadget when listening to 
> untrusted network traffic for log data. This affects Log4j versions from 1.2 
> up to 1.2.17.
>  
> [https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17571]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-9366) please consider upgrade log4j to log4j2 due to critical security problem CVE-2019-17571

2020-01-06 Thread Dongjin Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjin Lee reassigned KAFKA-9366:
--

Assignee: Dongjin Lee

> please consider upgrade log4j to log4j2 due to critical security problem 
> CVE-2019-17571
> ---
>
> Key: KAFKA-9366
> URL: https://issues.apache.org/jira/browse/KAFKA-9366
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.2.0, 2.1.1, 2.3.0, 2.4.0
>Reporter: leibo
>Assignee: Dongjin Lee
>Priority: Critical
>
> h2. CVE-2019-17571 Detail
> Included in Log4j 1.2 is a SocketServer class that is vulnerable to 
> deserialization of untrusted data which can be exploited to remotely execute 
> arbitrary code when combined with a deserialization gadget when listening to 
> untrusted network traffic for log data. This affects Log4j versions from 1.2 
> up to 1.2.17.
>  
> [https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17571]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9367) CRC failure

2020-01-06 Thread Shivangi Singh (Jira)
Shivangi Singh created KAFKA-9367:
-

 Summary: CRC failure 
 Key: KAFKA-9367
 URL: https://issues.apache.org/jira/browse/KAFKA-9367
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 2.0.0
Reporter: Shivangi Singh


We have a 14-node Kafka (2.0.0) cluster.

In our case 

*Leader* : *Broker Id* : 1003 *Ip*: 10.84.198.238
*Replica* : *Broker Id* : 1014 *Ip*: 10.22.2.74

A request was sent from the replica -> leader, and the leader (10.84.198.238) 
logged the following exception:


/var/log/kafka/server.log.2019-12-26-00:[2019-12-26 00:13:04,386] ERROR Closing 
socket for 10.84.198.238:6667-10.22.2.74:53118-121025 because of error 
(kafka.network.Processor)
/var/log/kafka/server.log.2019-12-26-00-org.apache.kafka.common.errors.InvalidRequestException:
 Error getting request for apiKey: FETCH, apiVersion: 8, connectionId: 
10.84.198.238:6667-10.22.2.74:53118-121025, listenerName: 
ListenerName(PLAINTEXT), principal: User:ANONYMOUS
/var/log/kafka/server.log.2019-12-26-00-Caused by: 
org.apache.kafka.common.protocol.types.SchemaException: *Error reading field 
'forgotten_topics_data':* Error reading array of size 23668, only 69 bytes 
available
/var/log/kafka/server.log.2019-12-26-00- at 
org.apache.kafka.common.protocol.types.Schema.read(Schema.java:77)
/var/log/kafka/server.log.2019-12-26-00- at 
org.apache.kafka.common.protocol.ApiKeys.parseRequest(ApiKeys.java:290)
/var/log/kafka/server.log.2019-12-26-00- at 
org.apache.kafka.common.requests.RequestContext.parseRequest(RequestContext.java:63)



In response, the replica (10.22.2.74) logged the following:

 

[2019-12-26 00:13:04,390] WARN [ReplicaFetcher replicaId=1014, leaderId=1003, 
fetcherId=0] Error in response for fetch request (type=FetchRequest, 
replicaId=1014, maxWait=500, minBytes=1, maxBytes=10485760, 
fetchData={_topic_name_=(offset=50344687, logStartOffset=24957467, 
maxBytes=1048576)}, isolationLevel=READ_UNCOMMITTED, toForget=, 
metadata=(sessionId=1747349875, epoch=183382033)) 
(kafka.server.ReplicaFetcherThread)
java.io.IOException: Connection to 1003 was disconnected before the response 
was read
at 
org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:97)
at 
kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:96)
at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:240)
at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:43)
at 
kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:149)
at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:114)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)



After this, broker 1003 logged the following exception:


/var/log/kafka/server.log.2019-12-26-00:[2019-12-26 00:16:37,828] ERROR 
[ReplicaFetcher replicaId=1003, leaderId=1014, fetcherId=0] Found invalid 
messages during fetch for partition _topic_name_ offset 91200983 
(kafka.server.ReplicaFetcherThread)
/var/log/kafka/server.log.2019-12-26-00-*org.apache.kafka.common.record.InvalidRecordException:
 Record is corrupt (stored crc = 1460037823, computed crc = 114378201)*
/var/log/kafka/server.log.2019-12-26-00:[2019-12-26 00:16:40,690] ERROR Closing 
socket for 10.84.198.238:6667-10.22.2.74:49850-740543 because of error 
(kafka.network.Processor)
/var/log/kafka/server.log.2019-12-26-00-org.apache.kafka.common.errors.InvalidRequestException:
 Error getting request for apiKey: FETCH, apiVersion: 8, connectionId: 
10.84.198.238:6667-10.22.2.74:49850-740543, listenerName: 
ListenerName(PLAINTEXT), principal: User:ANONYMOUS
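
For context, a hedged sketch of what the CRC check conceptually does (Kafka's 
v2 record batches actually use CRC32C over the batch payload; the class below 
is illustrative only, not Kafka's DefaultRecordBatch code):
{code:java}
import java.util.zip.CRC32C; // in the JDK since Java 9

class CrcCheckSketch {
    // Recompute the checksum over the batch bytes and compare it to
    // the stored value from the batch header.
    static boolean isCorrupt(byte[] batchPayload, long storedCrc) {
        CRC32C crc = new CRC32C();
        crc.update(batchPayload, 0, batchPayload.length);
        // A mismatch surfaces as "Record is corrupt (stored crc = ...,
        // computed crc = ...)" in the broker log.
        return crc.getValue() != storedCrc;
    }
}
{code}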


Could you help us with the above issue?

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9367) CRC failure

2020-01-06 Thread Shivangi Singh (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009032#comment-17009032
 ] 

Shivangi Singh commented on KAFKA-9367:
---

We are also witnessing the following exceptions:

 

[2020-01-06 21:42:12,375] ERROR [ReplicaFetcher replicaId=1009, leaderId=1006, 
fetcherId=0] Found invalid messages during fetch for partition _topic_name_ 
offset 909070901 (kafka.server.ReplicaFetcherThread)
org.apache.kafka.common.errors.CorruptRecordException: Invalid magic found in 
record: 117
[2020-01-06 21:42:13,417] ERROR [ReplicaFetcher replicaId=1009, leaderId=1006, 
fetcherId=0] Found invalid messages during fetch for partition 
payments_payment_data_flow_instances-26 offset 909070901 
(kafka.server.ReplicaFetcherThread)
org.apache.kafka.common.record.InvalidRecordException: Record is corrupt 
(stored crc = 3006481486, computed crc = 2816728803)

> CRC failure 
> 
>
> Key: KAFKA-9367
> URL: https://issues.apache.org/jira/browse/KAFKA-9367
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.0.0
>Reporter: Shivangi Singh
>Priority: Major
>
> We have a 14-node Kafka (2.0.0) cluster. 
> In our case 
> *Leader* : *Broker Id* : 1003 *Ip*: 10.84.198.238
> *Replica* : *Broker Id* : 1014 *Ip*: 10.22.2.74
> A request was sent from the replica -> leader, and the leader (10.84.198.238) 
> logged the following exception:
> /var/log/kafka/server.log.2019-12-26-00:[2019-12-26 00:13:04,386] ERROR 
> Closing socket for 10.84.198.238:6667-10.22.2.74:53118-121025 because of 
> error (kafka.network.Processor)
> /var/log/kafka/server.log.2019-12-26-00-org.apache.kafka.common.errors.InvalidRequestException:
>  Error getting request for apiKey: FETCH, apiVersion: 8, connectionId: 
> 10.84.198.238:6667-10.22.2.74:53118-121025, listenerName: 
> ListenerName(PLAINTEXT), principal: User:ANONYMOUS
> /var/log/kafka/server.log.2019-12-26-00-Caused by: 
> org.apache.kafka.common.protocol.types.SchemaException: *Error reading field 
> 'forgotten_topics_data':* Error reading array of size 23668, only 69 bytes 
> available
> /var/log/kafka/server.log.2019-12-26-00- at 
> org.apache.kafka.common.protocol.types.Schema.read(Schema.java:77)
> /var/log/kafka/server.log.2019-12-26-00- at 
> org.apache.kafka.common.protocol.ApiKeys.parseRequest(ApiKeys.java:290)
> /var/log/kafka/server.log.2019-12-26-00- at 
> org.apache.kafka.common.requests.RequestContext.parseRequest(RequestContext.java:63)
> 
> In response, the replica (10.22.2.74) logged the following: 
>  
> [2019-12-26 00:13:04,390] WARN [ReplicaFetcher replicaId=1014, leaderId=1003, 
> fetcherId=0] Error in response for fetch request (type=FetchRequest, 
> replicaId=1014, maxWait=500, minBytes=1, maxBytes=10485760, 
> fetchData={_topic_name_=(offset=50344687, logStartOffset=24957467, 
> maxBytes=1048576)}, isolationLevel=READ_UNCOMMITTED, toForget=, 
> metadata=(sessionId=1747349875, epoch=183382033)) 
> (kafka.server.ReplicaFetcherThread)
> java.io.IOException: Connection to 1003 was disconnected before the response 
> was read
> at 
> org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:97)
> at 
> kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:96)
> at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:240)
> at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:43)
> at 
> kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:149)
> at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:114)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
> 
> After this, broker 1003 logged the following exception:
> /var/log/kafka/server.log.2019-12-26-00:[2019-12-26 00:16:37,828] ERROR 
> [ReplicaFetcher replicaId=1003, leaderId=1014, fetcherId=0] Found invalid 
> messages during fetch for partition _topic_name_ offset 91200983 
> (kafka.server.ReplicaFetcherThread)
> /var/log/kafka/server.log.2019-12-26-00-*org.apache.kafka.common.record.InvalidRecordException:
>  Record is corrupt (stored crc = 1460037823, computed crc = 114378201)*
> /var/log/kafka/server.log.2019-12-26-00:[2019-12-26 00:16:40,690] ERROR 
> Closing socket for 10.84.198.238:6667-10.22.2.74:49850-740543 because of 
> error (kafka.network.Processor)
> /var/log/kafka/server.log.2019-12-26-00-org.apache.kafka.common.errors.InvalidRequestException:
>  Error getting request for apiKey: FETCH, apiVersion: 8, connectionId: 
> 10.84.198.238:6667-10.22.2.74:49850-740543, listenerName: 
> ListenerName(PLAINTEXT), principal: User:ANONYMOUS
> Could you help us with the above issue?
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (KAFKA-9351) Higher count in destination cluster using Kafka MM2

2020-01-06 Thread Ryanne Dolan (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009035#comment-17009035
 ] 

Ryanne Dolan commented on KAFKA-9351:
-

[~nitishgoyal13] are those event counts or offsets? MM2 cannot guarantee that 
downstream offsets or event counts match exactly -- it only offers 
at-least-once semantics. So you can be sure that records are not dropped, and 
they maintain approximately the same order, but there may be dupes due to 
retries in the producer.

I'm working on a PoC for exactly-once semantics, but currently the Connect 
framework makes this difficult to get right.
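
For what it's worth, a hedged sketch of verifying delivery despite duplicates 
(standalone Java; the topic name, group id, and the assumption that the record 
key uniquely identifies an event are all hypothetical):
{code:java}
import java.time.Duration;
import java.util.Collections;
import java.util.HashSet;
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class DedupCount {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "dest-cluster:9092"); // placeholder
        props.put("group.id", "mm2-count-check");            // placeholder
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");

        Set<String> unique = new HashSet<>(); // distinct event ids
        long raw = 0;                         // raw count, includes dupes
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("nm5.events_4"));
            for (ConsumerRecord<String, String> rec :
                    consumer.poll(Duration.ofSeconds(10))) {
                raw++;
                unique.add(rec.key()); // assumes key identifies the event
            }
        }
        // raw may exceed unique under at-least-once delivery; unique should
        // match the source if nothing was dropped.
        System.out.printf("raw=%d unique=%d%n", raw, unique.size());
    }
}
{code}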

> Higher count in destination cluster using Kafka MM2
> ---
>
> Key: KAFKA-9351
> URL: https://issues.apache.org/jira/browse/KAFKA-9351
> Project: Kafka
>  Issue Type: Bug
>  Components: mirrormaker
>Affects Versions: 2.4.0
>Reporter: Nitish Goyal
>Priority: Blocker
>
> I have set up replication between clusters across different data centres. 
> After setting up replication, at times, I am observing a higher event count 
> in the destination cluster.
> Below are the counts in the source and destination clusters:
>  
> *Source Cluster*
> ```
>  
> events_4:0:51048
> events_4:1:52250
> events_4:2:51526
> ```
>  
> *Destination Cluster*
> ```
> nm5.events_4:0:53289
> nm5.events_4:1:54569
> nm5.events_4:2:53733
> ```
>  
> This is a blocker for us to start using the MM2 replicator.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9351) Higher count in destination cluster using Kafka MM2

2020-01-06 Thread Ryanne Dolan (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryanne Dolan updated KAFKA-9351:

Priority: Minor  (was: Blocker)

> Higher count in destination cluster using Kafka MM2
> ---
>
> Key: KAFKA-9351
> URL: https://issues.apache.org/jira/browse/KAFKA-9351
> Project: Kafka
>  Issue Type: Bug
>  Components: mirrormaker
>Affects Versions: 2.4.0
>Reporter: Nitish Goyal
>Priority: Minor
>
> I have set up replication between clusters across different data centres. 
> After setting up replication, at times, I am observing a higher event count 
> in the destination cluster.
> Below are the counts in the source and destination clusters:
>  
> *Source Cluster*
> ```
>  
> events_4:0:51048
> events_4:1:52250
> events_4:2:51526
> ```
>  
> *Destination Cluster*
> ```
> nm5.events_4:0:53289
> nm5.events_4:1:54569
> nm5.events_4:2:53733
> ```
>  
> This is a blocker for us to start using the MM2 replicator.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (KAFKA-9345) Deploy Mirror Maker 2.0 and dynamically modify the topic whitelist through REST API

2020-01-06 Thread Ryanne Dolan (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryanne Dolan closed KAFKA-9345.
---

> Deploy Mirror Maker 2.0 and dynamically modify the topic whitelist through 
> REST API
> ---
>
> Key: KAFKA-9345
> URL: https://issues.apache.org/jira/browse/KAFKA-9345
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect, mirrormaker
>Affects Versions: 2.4.0
> Environment: runtime env:
> source-cluster:  kafka 2.2.1
> target-cluster:   kafka 2.2.1
> Mirror Maker 2.0 : kafka 2.4.0 
>Reporter: yzhou
>Assignee: yzhou
>Priority: Minor
>
> 1. Which is the best way to deploy Mirror Maker 2.0 (a dedicated mm2 
> cluster, or running mm2 in a Connect cluster)? Could you tell me the 
> difference between them?
> 2. According to the blog or wiki 
> ([https://blog.cloudera.com/a-look-inside-kafka-mirrormaker-2/] , 
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0#KIP-382:MirrorMaker2.0-Config,ACLSync]
>   , 
> [https://github.com/apache/kafka/blob/cae2a5e1f0779a0889f6cb43b523ebc8a812f4c2/connect/mirror/README.md]
>     ), Mirror Maker 2.0 supports dynamic modification of the topic whitelist, 
> but I cannot figure out how to do it. Could you tell me how to solve this 
> problem?
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9345) Deploy Mirror Maker 2.0 and dynamically modify the topic whitelist through REST API

2020-01-06 Thread Ryanne Dolan (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009040#comment-17009040
 ] 

Ryanne Dolan commented on KAFKA-9345:
-

[~xinzhuxianshenger] no inconvenience. I'll close this ticket, as I don't 
believe it represents a bug.

> Deploy Mirror Maker 2.0 and dynamically modify the topic whitelist through 
> REST API
> ---
>
> Key: KAFKA-9345
> URL: https://issues.apache.org/jira/browse/KAFKA-9345
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect, mirrormaker
>Affects Versions: 2.4.0
> Environment: runtime env:
> source-cluster:  kafka 2.2.1
> target-cluster:   kafka 2.2.1
> Mirror Maker 2.0 : kafka 2.4.0 
>Reporter: yzhou
>Assignee: yzhou
>Priority: Minor
>
> 1. Which is the best way to deploy Mirror Maker 2.0 (a dedicated mm2 
> cluster, or running mm2 in a Connect cluster)? Could you tell me the 
> difference between them?
> 2. According to the blog or wiki 
> ([https://blog.cloudera.com/a-look-inside-kafka-mirrormaker-2/] , 
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0#KIP-382:MirrorMaker2.0-Config,ACLSync]
>   , 
> [https://github.com/apache/kafka/blob/cae2a5e1f0779a0889f6cb43b523ebc8a812f4c2/connect/mirror/README.md]
>     ), Mirror Maker 2.0 supports dynamic modification of the topic whitelist, 
> but I cannot figure out how to do it. Could you tell me how to solve this 
> problem?
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KAFKA-9345) Deploy Mirror Maker 2.0 and dynamically modify the topic whitelist through REST API

2020-01-06 Thread Ryanne Dolan (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryanne Dolan resolved KAFKA-9345.
-
Resolution: Information Provided

> Deploy Mirror Maker 2.0 and dynamically modify the topic whitelist through 
> REST API
> ---
>
> Key: KAFKA-9345
> URL: https://issues.apache.org/jira/browse/KAFKA-9345
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect, mirrormaker
>Affects Versions: 2.4.0
> Environment: runtime env:
> source-cluster:  kafka 2.2.1
> target-cluster:   kafka 2.2.1
> Mirror Maker 2.0 : kafka 2.4.0 
>Reporter: yzhou
>Assignee: yzhou
>Priority: Minor
>
> 1. Which is the best way to deploy Mirror Maker 2.0 (a dedicated mm2 
> cluster, or running mm2 in a Connect cluster)? Could you tell me the 
> difference between them?
> 2. According to the blog or wiki 
> ([https://blog.cloudera.com/a-look-inside-kafka-mirrormaker-2/] , 
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0#KIP-382:MirrorMaker2.0-Config,ACLSync]
>   , 
> [https://github.com/apache/kafka/blob/cae2a5e1f0779a0889f6cb43b523ebc8a812f4c2/connect/mirror/README.md]
>     ), Mirror Maker 2.0 supports dynamic modification of the topic whitelist, 
> but I cannot figure out how to do it. Could you tell me how to solve this 
> problem?
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9364) Fix misleading consumer logs on throttling

2020-01-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009086#comment-17009086
 ] 

ASF GitHub Bot commented on KAFKA-9364:
---

cmccabe commented on pull request #7894: KAFKA-9364: Fix misleading consumer 
logs on throttling
URL: https://github.com/apache/kafka/pull/7894
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Fix misleading consumer logs on throttling
> --
>
> Key: KAFKA-9364
> URL: https://issues.apache.org/jira/browse/KAFKA-9364
> Project: Kafka
>  Issue Type: Bug
>Reporter: Colin McCabe
>Assignee: Colin McCabe
>Priority: Minor
>
> Fix misleading consumer logs on throttling



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9324) Drop support for Scala 2.11 (KIP-531)

2020-01-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009094#comment-17009094
 ] 

ASF GitHub Bot commented on KAFKA-9324:
---

ijuma commented on pull request #7859: KAFKA-9324: Drop support for Scala 2.11 
(KIP-531)
URL: https://github.com/apache/kafka/pull/7859
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Drop support for Scala 2.11 (KIP-531)
> -
>
> Key: KAFKA-9324
> URL: https://issues.apache.org/jira/browse/KAFKA-9324
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Major
>  Labels: kip
> Fix For: 2.5.0
>
>
> See 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-531%3A+Drop+support+for+Scala+2.11+in+Kafka+2.5
>  for details.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9368) Preserve stream-time across rebalances/restarts

2020-01-06 Thread Sophie Blee-Goldman (Jira)
Sophie Blee-Goldman created KAFKA-9368:
--

 Summary: Preserve stream-time across rebalances/restarts
 Key: KAFKA-9368
 URL: https://issues.apache.org/jira/browse/KAFKA-9368
 Project: Kafka
  Issue Type: Bug
  Components: streams
Reporter: Sophie Blee-Goldman


Stream-time is used to make decisions about processing out-of-order records, or 
to drop them if they are late (i.e., timestamp < stream-time - grace-period). It 
is currently tracked on a per-processor basis, such that each node has its own 
local view of stream-time based on the maximum timestamp it has processed.

During rebalances and restarts, stream-time is initialized as UNKNOWN (i.e., -1) 
for all processors in tasks that are newly created (or migrated). In effect, we 
forget the current stream-time in this case, which may lead to non-deterministic 
behavior if we stop processing right before a late record that would be dropped 
if we continued processing, but is not dropped after the rebalance/restart. 
Let's look at an example with a grace period of 5ms for a tumbling window of 
5ms, and the following records (timestamps in parentheses):
{code:java}
r1(0) r2(5) r3(11) r4(2){code}
In the example, stream-time advances as 0, 5, 11, 11, and thus record `r4` is 
dropped as late because 2 < 6 = 11 - 5. However, if we stop processing or 
rebalance after processing `r3` but before processing `r4`, we would 
reinitialize stream-time as -1 and thus would process `r4` on restart/after the 
rebalance. The problem is that stream-time then advances differently from a 
global point of view: 0, 5, 11, 2.

Of course, this is a corner case, because if we stopped processing one record 
earlier -- i.e., after processing `r2` but before processing `r3` -- stream-time 
would be advanced correctly from a global point of view. 

Note that in previous versions the maximum partition-time was actually used for 
stream-time. This changed in 2.3 due to KAFKA-7895/[PR 
6278|https://github.com/apache/kafka/pull/6278], and could potentially change 
yet again in future versions (c.f. KAFKA-8769). Partition-time actually is 
preserved as of 2.4 thanks to KAFKA-7994.
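
A minimal standalone sketch of the check described above (illustrative only, 
not Kafka Streams' actual implementation):
{code:java}
class StreamTimeSketch {
    static final long GRACE_MS = 5;
    static long streamTime = -1; // UNKNOWN, as after a rebalance/restart

    // Returns true if the record is processed, false if dropped as late.
    static boolean process(long ts) {
        if (streamTime >= 0 && ts < streamTime - GRACE_MS) {
            return false;                      // late: drop
        }
        streamTime = Math.max(streamTime, ts); // advance stream-time
        return true;
    }
    // r1(0) r2(5) r3(11) r4(2): r4 is dropped because 2 < 11 - 5 = 6.
    // If streamTime is reset to -1 between r3 and r4 (rebalance/restart),
    // r4 is processed instead, so the outcome depends on where the reset
    // happened.
}
{code}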



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-9335) java.lang.IllegalArgumentException: Number of partitions must be at least 1.

2020-01-06 Thread Apurva Mehta (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apurva Mehta reassigned KAFKA-9335:
---

Assignee: Boyang Chen

> java.lang.IllegalArgumentException: Number of partitions must be at least 1.
> 
>
> Key: KAFKA-9335
> URL: https://issues.apache.org/jira/browse/KAFKA-9335
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.4.0
>Reporter: Nitay Kufert
>Assignee: Boyang Chen
>Priority: Major
>  Labels: bug
>
> Hey,
> When trying to upgrade our Kafka streams client to 2.4.0 (from 2.3.1) we 
> encountered the following exception: 
> {code:java}
> java.lang.IllegalArgumentException: Number of partitions must be at least 1.
> {code}
> It's important to notice that the exact same code works just fine at 2.3.1.
>  
> I have created a "toy" example which reproduces this exception:
> [https://gist.github.com/nitayk/50da33b7bcce19ad0a7f8244d309cb8f]
> and I would love to get some insight regarding why its happening / ways to 
> get around it
>  
> Thanks



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-7994) Improve Stream-Time for rebalances and restarts

2020-01-06 Thread Sophie Blee-Goldman (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sophie Blee-Goldman updated KAFKA-7994:
---
Description: 
We compute a per-partition partition-time as the maximum timestamp over all 
records processed so far. Before 2.3 this was used to determine the logical 
stream-time used to make decisions about processing out-of-order records, or to 
drop them if they are late (i.e., timestamp < stream-time - grace-period). 
Preserving the stream-time is necessary to ensure deterministic results (see 
KAFKA-9368), and although the processor-time is now used instead of 
partition-time, preserving the partition-time is a first step towards improving 
the overall stream-time semantics.

The partition-time is also used by the TimestampExtractor. It gets passed in to 
#extract and can be used to determine a rough timestamp estimate if the actual 
timestamp is missing, corrupt, etc. This means in the corner case where the 
next record to be processed after a rebalance/restart cannot have its actual 
timestamp determined, we have no way of coming up with a reasonable guess 
and the record will likely have to be dropped.

 

A potential fix would be to store the latest observed partition-time in the 
metadata of committed offsets. This way, on restart/rebalance we can 
re-initialize partition-time correctly.
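
A hedged sketch of that idea using the consumer API's OffsetAndMetadata (the 
encoding of partition-time in the metadata string is illustrative, not an 
actual patch; the class and method names are hypothetical):
{code:java}
import java.util.Collections;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

class PartitionTimeSketch {
    // On commit: piggyback the latest observed partition-time on the
    // committed offset's metadata string.
    static void commit(KafkaConsumer<?, ?> consumer, TopicPartition tp,
                       long nextOffset, long partitionTime) {
        consumer.commitSync(Collections.singletonMap(
            tp, new OffsetAndMetadata(nextOffset, Long.toString(partitionTime))));
    }

    // On restart/rebalance: read it back and re-initialize partition-time
    // instead of starting from UNKNOWN (-1).
    static long restore(KafkaConsumer<?, ?> consumer, TopicPartition tp) {
        OffsetAndMetadata committed = consumer.committed(tp);
        return (committed == null || committed.metadata().isEmpty())
            ? -1L
            : Long.parseLong(committed.metadata());
    }
}
{code}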

  was:
We compute a per-partition partition-time as the maximum timestamp over all 
records processed so far. Furthermore, we use partition-time to compute 
stream-time for each task as the maximum over all partition-times (for all 
corresponding task partitions). This stream-time is used to make decisions 
about processing out-of-order records, or to drop them if they are late (i.e., 
timestamp < stream-time - grace-period).

During rebalances and restarts, stream-time is initialized as UNKNOWN (i.e., -1) 
for tasks that are newly created (or migrated). In effect, we forget the 
current stream-time in this case, which may lead to non-deterministic behavior 
if we stop processing right before a late record that would be dropped if we 
continued processing, but is not dropped after the rebalance/restart. Let's look 
at an example with a grace period of 5ms for a tumbling window of 5ms, and the 
following records (timestamps in parentheses):

 
{code:java}
r1(0) r2(5) r3(11) r4(2){code}
In the example, stream-time advances as 0, 5, 11, 11, and thus record `r4` is 
dropped as late because 2 < 6 = 11 - 5. However, if we stop processing or 
rebalance after processing `r3` but before processing `r4`, we would 
reinitialize stream-time as -1 and thus would process `r4` on restart/after the 
rebalance. The problem is that stream-time then advances differently from a 
global point of view: 0, 5, 11, 2.

Note, this is a corner case, because if we stopped processing one record 
earlier, i.e., after processing `r2` but before processing `r3`, stream-time 
would be advanced correctly from a global point of view.

A potential fix would be to store the latest observed partition-time in the 
metadata of committed offsets. This way, on restart/rebalance we can 
re-initialize time correctly.


> Improve Stream-Time for rebalances and restarts
> ---
>
> Key: KAFKA-7994
> URL: https://issues.apache.org/jira/browse/KAFKA-7994
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Richard Yu
>Priority: Major
> Fix For: 2.4.0
>
> Attachments: possible-patch.diff
>
>
> We compute a per-partition partition-time as the maximum timestamp over all 
> records processed so far. Before 2.3 this was used to determine the logical 
> stream-time used to make decisions about processing out-of-order records or 
> drop them if they are late (ie, timestamp < stream-time - grace-period). 
> Preserving the stream-time is necessary to ensure deterministic results (see 
> KAFKA-9368), and although the processor-time is now used instead of 
> partition-time, preserving the partition-time is a first step towards 
> improving the overall stream-time semantics.
> The partition-time is also used by the TimestampExtractor. It gets passed in 
> to #extract and can be used to determine a rough timestamp estimate if the 
> actual timestamp is missing, corrupt, etc. This means in the corner case 
> where the next record to be processed after a rebalance/restart cannot have 
> its actual timestamp determined, we have no way of coming up with a 
> reasonable guess and the record will likely have to be dropped.
>  
> A potential fix would be to store the latest observed partition-time in the 
> metadata of committed offsets. This way, on restart/rebalance we can 
> re-initialize partition-time correctly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-7994) Improve Partition-Time for rebalances and restarts

2020-01-06 Thread Sophie Blee-Goldman (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sophie Blee-Goldman updated KAFKA-7994:
---
Summary: Improve Partition-Time for rebalances and restarts  (was: Improve 
Stream-Time for rebalances and restarts)

> Improve Partition-Time for rebalances and restarts
> --
>
> Key: KAFKA-7994
> URL: https://issues.apache.org/jira/browse/KAFKA-7994
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Richard Yu
>Priority: Major
> Fix For: 2.4.0
>
> Attachments: possible-patch.diff
>
>
> We compute a per-partition partition-time as the maximum timestamp over all 
> records processed so far. Before 2.3 this was used to determine the logical 
> stream-time used to make decisions about processing out-of-order records or 
> drop them if they are late (ie, timestamp < stream-time - grace-period). 
> Preserving the stream-time is necessary to ensure deterministic results (see 
> KAFKA-9368), and although the processor-time is now used instead of 
> partition-time, preserving the partition-time is a first step towards 
> improving the overall stream-time semantics.
> The partition-time is also used by the TimestampExtractor. It gets passed in 
> to #extract and can be used to determine a rough timestamp estimate if the 
> actual timestamp is missing, corrupt, etc. This means in the corner case 
> where the next record to be processed after a rebalance/restart cannot have 
> its actual timestamp determined, we have no way of coming up with a 
> reasonable guess and the record will likely have to be dropped.
>  
> A potential fix would be to store the latest observed partition-time in the 
> metadata of committed offsets. This way, on restart/rebalance we can 
> re-initialize partition-time correctly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9368) Preserve stream-time across rebalances/restarts

2020-01-06 Thread Sophie Blee-Goldman (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009121#comment-17009121
 ] 

Sophie Blee-Goldman commented on KAFKA-9368:


In KAFKA-7994 we fixed the partition-time for rebalances/restarts, but since 
stream-time is now determined on a per-processor basis and not by the 
partition-time, we should track the issue separately. Thus I split the 
"preserve stream-time" aspect of KAFKA-7994 into a new ticket (see the original 
ticket for the full discussion).

> Preserve stream-time across rebalances/restarts
> ---
>
> Key: KAFKA-9368
> URL: https://issues.apache.org/jira/browse/KAFKA-9368
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Sophie Blee-Goldman
>Priority: Major
>
> Stream-time is used to make decisions about processing out-of-order records, 
> or to drop them if they are late (i.e., timestamp < stream-time - 
> grace-period). It is currently tracked on a per-processor basis, such that 
> each node has its own local view of stream-time based on the maximum 
> timestamp it has processed.
> During rebalances and restarts, stream-time is initialized as UNKNOWN (i.e., 
> -1) for all processors in tasks that are newly created (or migrated). In 
> effect, we forget the current stream-time in this case, which may lead to 
> non-deterministic behavior if we stop processing right before a late record 
> that would be dropped if we continued processing, but is not dropped after 
> the rebalance/restart. Let's look at an example with a grace period of 5ms 
> for a tumbling window of 5ms, and the following records (timestamps in 
> parentheses):
> {code:java}
> r1(0) r2(5) r3(11) r4(2){code}
> In the example, stream-time advances as 0, 5, 11, 11, and thus record `r4` is 
> dropped as late because 2 < 6 = 11 - 5. However, if we stop processing or 
> rebalance after processing `r3` but before processing `r4`, we would 
> reinitialize stream-time as -1 and thus would process `r4` on restart/after 
> the rebalance. The problem is that stream-time then advances differently from 
> a global point of view: 0, 5, 11, 2.
> Of course, this is a corner case, because if we stopped processing one 
> record earlier -- i.e., after processing `r2` but before processing `r3` -- 
> stream-time would be advanced correctly from a global point of view. 
> Note that in previous versions the maximum partition-time was actually used 
> for stream-time. This changed in 2.3 due to KAFKA-7895/[PR 
> 6278|https://github.com/apache/kafka/pull/6278], and could potentially change 
> yet again in future versions (c.f. KAFKA-8769). Partition-time actually is 
> preserved as of 2.4 thanks to KAFKA-7994.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-7994) Improve Partition-Time for rebalances and restarts

2020-01-06 Thread Sophie Blee-Goldman (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009128#comment-17009128
 ] 

Sophie Blee-Goldman commented on KAFKA-7994:


Since we fixed part of this issue but not the full scope, because 
partition-time is no longer used to determine stream-time, I've updated the 
description to cover only the preservation of partition-time (which was fixed 
for 2.4). The remaining work w.r.t. preserving stream-time was broken out into 
a new ticket so we can track that separately. See KAFKA-9368.

> Improve Partition-Time for rebalances and restarts
> --
>
> Key: KAFKA-7994
> URL: https://issues.apache.org/jira/browse/KAFKA-7994
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Richard Yu
>Priority: Major
> Fix For: 2.4.0
>
> Attachments: possible-patch.diff
>
>
> We compute a per-partition partition-time as the maximum timestamp over all 
> records processed so far. Before 2.3 this was used to determine the logical 
> stream-time used to make decisions about processing out-of-order records or 
> drop them if they are late (ie, timestamp < stream-time - grace-period). 
> Preserving the stream-time is necessary to ensure deterministic results (see 
> KAFKA-9368), and although the processor-time is now used instead of 
> partition-time, preserving the partition-time is a first step towards 
> improving the overall stream-time semantics.
> The partition-time is also used by the TimestampExtractor. It gets passed in 
> to #extract and can be used to determine a rough timestamp estimate if the 
> actual timestamp is missing, corrupt, etc. This means in the corner case 
> where the next record to be processed after a rebalance/restart cannot have 
> its actual timestamp determined, we have no idea way of coming up with a 
> reasonable guess and the record will likely have to be dropped.
>  
> A potential fix would be, to store latest observed partition-time in the 
> metadata of committed offsets. This way, on restart/rebalance we can 
> re-initialize partition-time correctly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9369) Allow Consumers and Producers to Connect with User-Agent String

2020-01-06 Thread David Mollitor (Jira)
David Mollitor created KAFKA-9369:
-

 Summary: Allow Consumers and Producers to Connect with User-Agent 
String
 Key: KAFKA-9369
 URL: https://issues.apache.org/jira/browse/KAFKA-9369
 Project: Kafka
  Issue Type: Improvement
Reporter: David Mollitor


Given the ad hoc nature of consumers and producers in Kafka, it can be difficult 
to track where connections to brokers and partitions are coming from.

 

Please allow consumers and producers to pass an optional _user-agent_ string 
during the connection process so that they can be identified quickly and 
accurately.  For example, if I am performing an upgrade on my consumers, I want 
to be able to see that no consumers with an older version of the consuming 
software still exist; or if I see an application that is configured to consume 
from the wrong consumer group, it can quickly be identified and removed.

 

[https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9369) Allow Consumers and Producers to Connect with User-Agent String

2020-01-06 Thread Andrew Otto (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009172#comment-17009172
 ] 

Andrew Otto commented on KAFKA-9369:


You could use the client.id configuration property when you instantiate your 
client:

[https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-Requests]

 

[https://groups.google.com/forum/#!topic/kafka-clients/O2POKeq1EUE]
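
A minimal sketch of that suggestion (broker address, group id, and the id 
value are placeholders; client.id is a documented client config):
{code:java}
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

class ClientIdExample {
    static KafkaConsumer<String, String> newConsumer() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092");   // placeholder
        props.put("group.id", "payments-app");           // placeholder
        props.put("client.id", "payments-app-v2.1.0");   // identifies this
                                                         // client to brokers
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        return new KafkaConsumer<>(props);
    }
}
{code}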

 

 

 

> Allow Consumers and Producers to Connect with User-Agent String
> ---
>
> Key: KAFKA-9369
> URL: https://issues.apache.org/jira/browse/KAFKA-9369
> Project: Kafka
>  Issue Type: Improvement
>Reporter: David Mollitor
>Priority: Minor
>
> Given the ad hoc nature of consumers and producers in Kafka, it can be 
> difficult to track where connections to brokers and partitions are coming 
> from.
>  
> Please allow consumers and producers to pass an optional _user-agent_ string 
> during the connection process so that they can quickly and accurately be 
> identified.  For example, if I am performing an upgrade on my consumers, I 
> want to be able to see that no consumers with an older version number of the 
> consuming software still exist; or if I see an application that is configured 
> to consume from the wrong consumer group, it can quickly be identified and 
> removed.
>  
> [https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9370) Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in progress

2020-01-06 Thread Vikas Singh (Jira)
Vikas Singh created KAFKA-9370:
--

 Summary: Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in 
progress
 Key: KAFKA-9370
 URL: https://issues.apache.org/jira/browse/KAFKA-9370
 Project: Kafka
  Issue Type: Bug
Reporter: Vikas Singh


`KafkaApis::handleCreatePartitionsRequest` returns `INVALID_TOPIC_EXCEPTION` if 
the topic is being deleted. Change it to return `UNKNOWN_TOPIC_OR_PARTITION` 
instead. After the delete-topic API returns, the client should see the topic as 
deleted. The fact that deletion is processed in the background shouldn't have 
any impact.
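
A hedged sketch of the proposed mapping (the real handler lives in the Scala 
class `KafkaApis`; this Java fragment only illustrates the error choice, and 
the class and `topicsBeingDeleted` names are hypothetical):
{code:java}
import java.util.Set;
import org.apache.kafka.common.protocol.Errors;

class DeletionErrorSketch {
    static Errors createPartitionsError(String topic,
                                        Set<String> topicsBeingDeleted) {
        // Proposed: report a topic that is being deleted as unknown,
        // since the client should already consider it gone once the
        // delete-topic call returned (instead of INVALID_TOPIC_EXCEPTION).
        return topicsBeingDeleted.contains(topic)
            ? Errors.UNKNOWN_TOPIC_OR_PARTITION
            : Errors.NONE;
    }
}
{code}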



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-9370) Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in progress

2020-01-06 Thread Vikas Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Singh reassigned KAFKA-9370:
--

Assignee: Vikas Singh

> Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in progress
> --
>
> Key: KAFKA-9370
> URL: https://issues.apache.org/jira/browse/KAFKA-9370
> Project: Kafka
>  Issue Type: Bug
>Reporter: Vikas Singh
>Assignee: Vikas Singh
>Priority: Major
>
> `KafkaApis::handleCreatePartitionsRequest` returns `INVALID_TOPIC_EXCEPTION` 
> if the topic is being deleted. Change it to return 
> `UNKNOWN_TOPIC_OR_PARTITION` instead. After the delete-topic API returns, the 
> client should see the topic as deleted. The fact that deletion is processed 
> in the background shouldn't have any impact.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9369) Allow Consumers and Producers to Connect with User-Agent String

2020-01-06 Thread David Mollitor (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009192#comment-17009192
 ] 

David Mollitor commented on KAFKA-9369:
---

[~ottomata] Thanks for that.  I am familiar with that configuration.  However, 
there is no corollary for the producers.

 

[https://jaceklaskowski.gitbooks.io/apache-kafka/kafka-properties-client-id.html]

> Allow Consumers and Producers to Connect with User-Agent String
> ---
>
> Key: KAFKA-9369
> URL: https://issues.apache.org/jira/browse/KAFKA-9369
> Project: Kafka
>  Issue Type: Improvement
>Reporter: David Mollitor
>Priority: Minor
>
> Given the ad hoc nature of consumers and producers in Kafka, it can be 
> difficult to track where connections to brokers and partitions are coming 
> from.
>  
> Please allow consumers and producers to pass an optional _user-agent_ string 
> during the connection process so that they can quickly and accurately be 
> identified.  For example, if I am performing an upgrade on my consumers, I 
> want to be able to see that no consumers with an older version number of the 
> consuming software still exist; or if I see an application that is configured 
> to consume from the wrong consumer group, it can quickly be identified and 
> removed.
>  
> [https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9348) Broker shutdown during controller initialization can lead to zombie broker

2020-01-06 Thread David Arthur (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009229#comment-17009229
 ] 

David Arthur commented on KAFKA-9348:
-

I talked with [~hachikuji] about this and after looking through the code, it 
seems like this shouldn't be able to happen since things are serialized through 
the event queue in the Controller. We'll leave this issue open in case we see 
this again and can capture some logs.
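
For reference, a hedged sketch of the kind of guard the description implies is 
missing (names are illustrative, not Kafka's ControllerChannelManager code):
{code:java}
// Illustrative only: prevent initialization from reviving send threads
// after shutdown has begun, so none escape the initiateShutdown() calls.
final class SendThreadManager {
    private final Object lock = new Object();
    private boolean shuttingDown = false;

    void startSendThread(Thread t) {
        synchronized (lock) {
            if (shuttingDown) {
                return; // don't start threads after shutdown() returned
            }
            t.start();
        }
    }

    void shutdown() {
        synchronized (lock) {
            shuttingDown = true; // from here on, no new send threads
        }
        // ... then stop and join the existing send threads ...
    }
}
{code}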

> Broker shutdown during controller initialization can lead to zombie broker
> --
>
> Key: KAFKA-9348
> URL: https://issues.apache.org/jira/browse/KAFKA-9348
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Priority: Major
>
> It is possible that a broker may be shutdown while it is in the process of 
> becoming the controller. There is no protection currently to ensure that 
> initialization doesn't interfere with shutdown. An example of this is the 
> shutdown of the controller channel manager. It is possible that the request 
> send threads are restarted by the initialization logic _after_ the shutdown 
> method has returned. In this case, there will be no call to 
> `initiateShutdown` on any newly created send threads which will leave the 
> shutdown hook hanging.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9335) java.lang.IllegalArgumentException: Number of partitions must be at least 1.

2020-01-06 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-9335:
---
Priority: Blocker  (was: Major)

> java.lang.IllegalArgumentException: Number of partitions must be at least 1.
> 
>
> Key: KAFKA-9335
> URL: https://issues.apache.org/jira/browse/KAFKA-9335
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.4.0
>Reporter: Nitay Kufert
>Assignee: Boyang Chen
>Priority: Blocker
>  Labels: bug
>
> Hey,
> When trying to upgrade our Kafka streams client to 2.4.0 (from 2.3.1) we 
> encountered the following exception: 
> {code:java}
> java.lang.IllegalArgumentException: Number of partitions must be at least 1.
> {code}
> It's important to notice that the exact same code works just fine at 2.3.1.
>  
> I have created a "toy" example which reproduces this exception:
> [https://gist.github.com/nitayk/50da33b7bcce19ad0a7f8244d309cb8f]
> and I would love to get some insight regarding why its happening / ways to 
> get around it
>  
> Thanks



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-8764) LogCleanerManager endless loop while compacting/cleaning segments

2020-01-06 Thread Jeff Nadler (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009262#comment-17009262
 ] 

Jeff Nadler commented on KAFKA-8764:


I wanted to throw my .02 in here:   We are seeing the same issue, also with 
compact topics.   The compact topics in question are quite low traffic.

LogCleaner is running flat-out, with only a few ms between attempts to clean a 
single topic-partition.

The log cleaner thread is consuming almost an entire core:
{code:java}
14857 kafka     20   0 8894844 2.187g  12396 R 87.5 56.8  47:04.92 
kafka-log-clean 
{code}
We're running 2.4.0, openjdk11, happy to provide any add'l info to help with 
this issue.   

 

> LogCleanerManager endless loop while compacting/cleaning segments
> -
>
> Key: KAFKA-8764
> URL: https://issues.apache.org/jira/browse/KAFKA-8764
> Project: Kafka
>  Issue Type: Bug
>  Components: log cleaner
>Affects Versions: 2.3.0, 2.2.1
> Environment: docker base image: openjdk:8-jre-alpine base image, 
> kafka from http://ftp.carnet.hr/misc/apache/kafka/2.2.1/kafka_2.12-2.2.1.tgz
>Reporter: Tomislav Rajakovic
>Priority: Major
> Attachments: log-cleaner-bug-reproduction.zip
>
>
> {{LogCleanerManager is stuck in an endless loop while clearing segments for 
> one partition, resulting in many log outputs and heavy disk reads/writes/IOPS.}}
>  
> Issue appeared on follower brokers, and it happens on every (new) broker if 
> partition assignment is changed.
>  
> Original issue setup:
>  * kafka_2.12-2.2.1 deployed as statefulset on kubernetes, 5 brokers
>  * log directory is (AWS) EBS mounted PV, gp2 (ssd) kind of 750GiB
>  * 5 zookeepers
>  * topic created with config:
>  ** name = "backup_br_domain_squad"
> partitions = 36
> replication_factor = 3
> config = {
>  "cleanup.policy" = "compact"
>  "min.compaction.lag.ms" = "8640"
>  "min.cleanable.dirty.ratio" = "0.3"
> }
>  
>  
> Log excerpt:
> {{[2019-08-07 12:10:53,895] INFO [Log partition=backup_br_domain_squad-14, 
> dir=/var/lib/kafka/data/topics] Deleting segment 0 (kafka.log.Log)}}
> {{[2019-08-07 12:10:53,895] INFO Deleted log 
> /var/lib/kafka/data/topics/backup_br_domain_squad-14/.log.deleted.
>  (kafka.log.LogSegment)}}
> {{[2019-08-07 12:10:53,896] INFO Deleted offset index 
> /var/lib/kafka/data/topics/backup_br_domain_squad-14/.index.deleted.
>  (kafka.log.LogSegment)}}
> {{[2019-08-07 12:10:53,896] INFO Deleted time index 
> /var/lib/kafka/data/topics/backup_br_domain_squad-14/.timeindex.deleted.
>  (kafka.log.LogSegment)}}
> {{[2019-08-07 12:10:53,964] INFO [Log partition=backup_br_domain_squad-14, 
> dir=/var/lib/kafka/data/topics] Deleting segment 0 (kafka.log.Log)}}
> {{[2019-08-07 12:10:53,964] INFO Deleted log 
> /var/lib/kafka/data/topics/backup_br_domain_squad-14/.log.deleted.
>  (kafka.log.LogSegment)}}
> {{[2019-08-07 12:10:53,964] INFO Deleted offset index 
> /var/lib/kafka/data/topics/backup_br_domain_squad-14/.index.deleted.
>  (kafka.log.LogSegment)}}
> {{[2019-08-07 12:10:53,964] INFO Deleted time index 
> /var/lib/kafka/data/topics/backup_br_domain_squad-14/.timeindex.deleted.
>  (kafka.log.LogSegment)}}
> {{[2019-08-07 12:10:54,031] INFO [Log partition=backup_br_domain_squad-14, 
> dir=/var/lib/kafka/data/topics] Deleting segment 0 (kafka.log.Log)}}
> {{[2019-08-07 12:10:54,032] INFO Deleted log 
> /var/lib/kafka/data/topics/backup_br_domain_squad-14/.log.deleted.
>  (kafka.log.LogSegment)}}
> {{[2019-08-07 12:10:54,032] INFO Deleted offset index 
> /var/lib/kafka/data/topics/backup_br_domain_squad-14/.index.deleted.
>  (kafka.log.LogSegment)}}
> {{[2019-08-07 12:10:54,032] INFO Deleted time index 
> /var/lib/kafka/data/topics/backup_br_domain_squad-14/.timeindex.deleted.
>  (kafka.log.LogSegment)}}
> {{[2019-08-07 12:10:54,101] INFO [Log partition=backup_br_domain_squad-14, 
> dir=/var/lib/kafka/data/topics] Deleting segment 0 (kafka.log.Log)}}
> {{[2019-08-07 12:10:54,101] INFO Deleted log 
> /var/lib/kafka/data/topics/backup_br_domain_squad-14/.log.deleted.
>  (kafka.log.LogSegment)}}
> {{[2019-08-07 12:10:54,101] INFO Deleted offset index 
> /var/lib/kafka/data/topics/backup_br_domain_squad-14/.index.deleted.
>  (kafka.log.LogSegment)}}
> {{[2019-08-07 12:10:54,101] INFO Deleted time index 
> /var/lib/kafka/data/topics/backup_br_domain_squad-14/.timeindex.deleted.
>  (kafka.log.LogSegment)}}
> {{[2019-08-07 12:10:54,173] INFO [Log partition=backup_br_domain_squad-14, 
> dir=/var/lib/kafka/data/topics] Deleting segment 0 (kafka.log.Log)}}
> {{[2019-08-07 12:10:54,173] I

[jira] [Created] (KAFKA-9371) Disk space is not released after Kafka clears data due to retention settings

2020-01-06 Thread Arda Savran (Jira)
Arda Savran created KAFKA-9371:
--

 Summary: Disk space is not released after Kafka clears data due to 
retention settings
 Key: KAFKA-9371
 URL: https://issues.apache.org/jira/browse/KAFKA-9371
 Project: Kafka
  Issue Type: Bug
  Components: core, streams
Affects Versions: 2.2.0
 Environment: CentOS 7.7
Reporter: Arda Savran


We defined a retention time of 15 minutes on our topics. It looks like Kafka is 
deleting the messages as configured; however, the disk space is not released. 
"df" output shows 30G for kafka-logs instead of the real size, which is supposed 
to be 1 GB. 

 {{/usr/sbin/lsof | grep deleted }}

The output shows a bunch of files under kafka-logs that are deleted but are 
still consuming space.

Is this a known issue? Is there a setting that I can apply to the Kafka broker 
server?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9372) Add producer config to make topicExpiry configurable

2020-01-06 Thread Jiao Zhang (Jira)
Jiao Zhang created KAFKA-9372:
-

 Summary: Add producer config to make topicExpiry configurable
 Key: KAFKA-9372
 URL: https://issues.apache.org/jira/browse/KAFKA-9372
 Project: Kafka
  Issue Type: Improvement
  Components: producer 
Affects Versions: 1.1.0
Reporter: Jiao Zhang


Sometimes we got the error "org.apache.kafka.common.errors.TimeoutException: 
Failed to update metadata after 1000 ms" on the producer side. We investigated 
and found:
 # our producer produced messages at a really low rate, with intervals of more 
than 10 minutes;
 # by default, the producer expires topics after TOPIC_EXPIRY_MS; after a topic 
expires, if no data is produced before the next metadata update (automatically 
triggered by metadata.max.age.ms), the partitions entry for the topic 
disappears from the metadata cache. As a result, almost every produce call has 
to fetch metadata first, which can end with a timeout.

To solve this, we propose to add a new config, metadata.topic.expiry, to make 
topicExpiry configurable for the producer. Topic expiry is useful only when the 
producer is long-lived and produces to a variable set of topics. But in the 
case that producers are bound to a single topic or a few fixed topics, there is 
no need to expire topics at all.
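
A hedged sketch of the relevant settings (metadata.max.age.ms is an existing 
producer config; metadata.topic.expiry is the proposed, hypothetical one, and 
its value format here is a guess):
{code:java}
import java.util.Properties;

class TopicExpiryConfigExample {
    static Properties producerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092"); // placeholder
        // Existing config: proactive metadata refresh (default 5 min).
        props.put("metadata.max.age.ms", "300000");
        // Proposed, hypothetical config from this ticket: disable topic
        // expiry so partition info survives between infrequent sends.
        props.put("metadata.topic.expiry", "false");
        return props;
    }
}
{code}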



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9373) Improve shutdown performance via lazy accessing the offset and time indices.

2020-01-06 Thread Adem Efe Gencer (Jira)
Adem Efe Gencer created KAFKA-9373:
--

 Summary: Improve shutdown performance via lazy accessing the 
offset and time indices.
 Key: KAFKA-9373
 URL: https://issues.apache.org/jira/browse/KAFKA-9373
 Project: Kafka
  Issue Type: Bug
  Components: log
Affects Versions: 2.3.1, 2.4.0, 2.3.0
Reporter: Adem Efe Gencer
Assignee: Adem Efe Gencer
 Fix For: 2.3.1, 2.4.0, 2.3.0


KAFKA-7283 enabled lazy mmap on index files by initializing indices on-demand 
rather than performing costly disk/memory operations when creating all indices 
on broker startup. This helped reduce the startup time of brokers. However, 
segment indices are still created when segments are closed, regardless of 
whether they are needed or not.
 
Ideally we should:
 * Improve shutdown performance via lazy accessing the offset and time indices.
 * Eliminate redundant disk accesses and memory mapped operations while 
deleting or renaming files that back segment indices.
 * Prevent illegal accesses to underlying indices of a closed segment, which 
would lead to memory leaks due to recreation of the underlying memory mapped 
objects.
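 
A minimal sketch of the lazy-initialization pattern involved (illustrative 
only, not Kafka's actual index classes):
{code:java}
import java.util.function.Supplier;

// The expensive mmap-backed resource is materialized only on first
// access; close()/delete() can skip all disk and mmap work otherwise.
final class LazyResource<T> {
    private final Supplier<T> loader; // e.g. opens and mmaps the index file
    private volatile T value;         // null until first access

    LazyResource(Supplier<T> loader) { this.loader = loader; }

    T get() {
        T v = value;
        if (v == null) {
            synchronized (this) {     // double-checked locking
                v = value;
                if (v == null) {
                    v = loader.get();
                    value = v;
                }
            }
        }
        return v;
    }

    // Cheap check so shutdown only flushes/unmaps indices that were
    // actually materialized.
    boolean initialized() { return value != null; }
}
{code}
On shutdown, a segment would then only flush and unmap the indices for which 
initialized() is true.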



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9373) Improve shutdown performance via lazy accessing the offset and time indices.

2020-01-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009314#comment-17009314
 ] 

ASF GitHub Bot commented on KAFKA-9373:
---

efeg commented on pull request #7900: KAFKA-9373: Improve shutdown performance 
via lazy accessing the offset and time indices
URL: https://github.com/apache/kafka/pull/7900
 
 
   KAFKA-7283 enabled lazy mmap on index files by initializing indices 
on demand rather than performing costly disk/memory operations when creating 
all indices on broker startup. This helped reduce broker startup time. 
However, segment indices are still created when segments are closed, even 
when they were never accessed and do not need to be.
   
   This patch:
   * Improves shutdown performance via lazy accessing the offset and time 
indices.
   * Eliminates redundant disk accesses and memory mapped operations while 
deleting or renaming files that back segment indices.
   * Prevents illegal accesses to underlying indices of a closed segment, which 
would lead to memory leaks due to recreation of the underlying memory mapped 
objects.
   
   In our evaluations in a cluster with 31 brokers, where each broker has 13K 
to 20K segments, we observed up to two orders of magnitude faster LogManager 
shutdown times with this patch -- i.e. the LogManager shutdown time of each 
broker dropped from tens of seconds to hundreds of milliseconds.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Improve shutdown performance via lazy accessing the offset and time indices.
> 
>
> Key: KAFKA-9373
> URL: https://issues.apache.org/jira/browse/KAFKA-9373
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 2.3.0, 2.4.0, 2.3.1
>Reporter: Adem Efe Gencer
>Assignee: Adem Efe Gencer
>Priority: Major
> Fix For: 2.3.0, 2.4.0, 2.3.1
>
>
> KAFKA-7283 enabled lazy mmap on index files by initializing indices on demand 
> rather than performing costly disk/memory operations when creating all 
> indices on broker startup. This helped reduce broker startup time. However, 
> segment indices are still created when segments are closed, even when they 
> were never accessed and do not need to be.
>  
> Ideally we should:
>  * Improve shutdown performance via lazy accessing the offset and time 
> indices.
>  * Eliminate redundant disk accesses and memory mapped operations while 
> deleting or renaming files that back segment indices.
>  * Prevent illegal accesses to underlying indices of a closed segment, which 
> would lead to memory leaks due to recreation of the underlying memory mapped 
> objects.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9351) Higher count in destination cluster using Kafka MM2

2020-01-06 Thread Nitish Goyal (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009365#comment-17009365
 ] 

Nitish Goyal commented on KAFKA-9351:
-

[~ryannedolan] These are event counts.

At times I see only a small difference in event counts, which is well 
explained by producer retries. But occasionally I see a huge difference in 
event counts, which worries me.

E.g., a partition with a few million events can show a difference of a few 
hundred thousand records.

Is this expected behavior? Can producer retries cause a difference of a few 
hundred thousand records in the destination cluster?
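
For context: with at-least-once delivery, a retried batch can be written 
twice, so duplicates scale with batch size and retry frequency. A hedged 
sketch of producer-level settings that reduce duplicates from retries (these 
are standard producer configs; whether and how MM2 passes them through to its 
internal producer is an assumption, not verified here):

{code:java}
import java.util.Properties;

public class IdempotentProducerProps {
    public static Properties producerOverrides() {
        Properties p = new Properties();
        // Broker-side deduplication of retried batches.
        p.put("enable.idempotence", "true");
        // Required companion settings for idempotence.
        p.put("acks", "all");
        p.put("max.in.flight.requests.per.connection", "5");
        return p;
    }
}
{code}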

> Higher count in destination cluster using Kafka MM2
> ---
>
> Key: KAFKA-9351
> URL: https://issues.apache.org/jira/browse/KAFKA-9351
> Project: Kafka
>  Issue Type: Bug
>  Components: mirrormaker
>Affects Versions: 2.4.0
>Reporter: Nitish Goyal
>Priority: Minor
>
> I have setup replication between cluster across different data centres. After 
> setting up replication, at times, I am observing higher event count in 
> destination cluster
> Below are counts in source and destination cluster
>  
> *Source Cluster*
> ```
>  
> events_4:0:51048
> events_4:1:52250
> events_4:2:51526
> ```
>  
> *Destination Cluster*
> ```
> nm5.events_4:0:53289
> nm5.events_4:1:54569
> nm5.events_4:2:53733
> ```
>  
> This is a blocker for us to start using the MM2 replicator



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9369) Allow Consumers and Producers to Connect with User-Agent String

2020-01-06 Thread Nikolay Izhikov (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009387#comment-17009387
 ] 

Nikolay Izhikov commented on KAFKA-9369:


[~belugabehr]

You can find the documentation for the producer "client.id" option here - 
http://kafka.apache.org/documentation.html#producerconfigs - and its 
definition in the source code: 
https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java#L128
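
For illustration, a minimal sketch of embedding user-agent-like information in 
the existing "client.id" config (client.id is a real producer config; the 
broker address and the naming scheme below are assumptions):

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class IdentifiedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption: local broker
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        // client.id appears in broker request logs, quotas and metrics,
        // so it can carry version/identity info today.
        props.put("client.id", "orders-service/2.4.0 (host-1234)");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("events", "key", "value"));
        }
    }
}
{code}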

> Allow Consumers and Producers to Connect with User-Agent String
> ---
>
> Key: KAFKA-9369
> URL: https://issues.apache.org/jira/browse/KAFKA-9369
> Project: Kafka
>  Issue Type: Improvement
>Reporter: David Mollitor
>Priority: Minor
>
> Given the adhoc nature of consumers and producers in Kafka, it can be 
> difficult to track where connections to brokers and partitions are coming 
> from.
>  
> Please allow consumers and producers to pass an optional _user-agent_ string 
> during the connection process so that they can quickly and accurately be 
> identified.  For example, if I am performing an upgrade on my consumers, I 
> want to be able to see that no consumers with an older version number of the 
> consuming software still exist or if I see an application that is configured 
> to consumer from the wrong consumer group, they can quickly be identified and 
> removed.
>  
> [https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9374) Worker can be disabled by blocked connectors

2020-01-06 Thread Chris Egerton (Jira)
Chris Egerton created KAFKA-9374:


 Summary: Worker can be disabled by blocked connectors
 Key: KAFKA-9374
 URL: https://issues.apache.org/jira/browse/KAFKA-9374
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 2.3.1, 2.4.0, 2.2.2, 2.2.1, 2.3.0, 2.1.1, 2.2.0, 2.1.0, 
2.0.1, 2.0.0, 1.1.1, 1.1.0, 1.0.2, 1.0.1, 1.0.0
Reporter: Chris Egerton
Assignee: Chris Egerton


If a connector hangs during any of its {{initialize}}, {{start}}, {{stop}}, 
{{taskConfigs}}, {{taskClass}}, {{version}}, {{config}}, or {{validate}} 
methods, the worker will be disabled for some types of requests thereafter, 
including connector creation, connector reconfiguration, and connector 
deletion. This only occurs in distributed mode and is due to the threading 
model used by the 
[DistributedHerder|https://github.com/apache/kafka/blob/03f763df8a8d9482d8c099806336f00cf2521465/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/distributed/DistributedHerder.java]
 class.

 

One potential solution could be to treat connectors that fail to start, stop, 
etc. in time the same way as tasks that fail to stop within the [task graceful 
shutdown timeout 
period|https://github.com/apache/kafka/blob/03f763df8a8d9482d8c099806336f00cf2521465/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerConfig.java#L121-L126]:
 handle all connector interactions on a separate thread, wait for them to 
complete within a timeout, and, if that timeout expires, abandon the thread 
and transition the connector to the {{FAILED}} state (if it has been created 
at all). A rough sketch of this approach follows below.
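
This is illustrative only, not the DistributedHerder's actual threading model; 
a plain Runnable stands in for the real connector interaction and the FAILED 
transition is left to the caller:

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ConnectorCallGuard {
    private final ExecutorService executor = Executors.newCachedThreadPool();

    // Runs a connector interaction (start, stop, validate, ...) on a
    // separate thread and gives up after the timeout, so a hung
    // connector cannot block the herder's main thread.
    public boolean runWithTimeout(Runnable connectorCall, long timeoutMs) {
        Future<?> future = executor.submit(connectorCall);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // abandon the thread; caller marks connector FAILED
            return false;
        } catch (Exception e) {
            return false;
        }
    }
}
{code}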



--
This message was sent by Atlassian Jira
(v8.3.4#803005)