[jira] [Resolved] (KAFKA-8805) Bump producer epoch following recoverable errors

2020-02-15 Thread Jason Gustafson (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-8805.

Fix Version/s: 2.5.0
   Resolution: Fixed

> Bump producer epoch following recoverable errors
> 
>
> Key: KAFKA-8805
> URL: https://issues.apache.org/jira/browse/KAFKA-8805
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer 
>Affects Versions: 2.3.0
>Reporter: Bob Barrett
>Assignee: Bob Barrett
>Priority: Major
> Fix For: 2.5.0
>
>
> As part of KIP-360, the producer needs to call the new InitProducerId API 
> after receiving UNKNOWN_PRODUCER_ID and INVALID_PRODUCER_MAPPING errors, 
> which will allow the producers to bump their epoch and continue processing 
> unless a new producer has already initialized a new producer ID.
> The broker change that this depends on is 
> https://issues.apache.org/jira/browse/KAFKA-8710.
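The recovery path described above can be sketched in a few lines. This is a hedged illustration only, not the actual producer internals: the class, enum, and method names below are hypothetical, and the real logic lives in the producer's transaction manager.

```java
// Hypothetical sketch of the KIP-360 error classification: on certain
// recoverable errors the producer re-sends InitProducerId (keeping its
// current producer ID) so the broker bumps the epoch, instead of failing.
import java.util.EnumSet;
import java.util.Set;

public class EpochBumpSketch {
    enum ApiError {
        UNKNOWN_PRODUCER_ID,
        INVALID_PRODUCER_MAPPING,
        PRODUCER_FENCED,
        OUT_OF_ORDER_SEQUENCE_NUMBER
    }

    // Errors KIP-360 treats as recoverable via an epoch bump.
    static final Set<ApiError> EPOCH_BUMP_RECOVERABLE =
        EnumSet.of(ApiError.UNKNOWN_PRODUCER_ID, ApiError.INVALID_PRODUCER_MAPPING);

    static String handle(ApiError error) {
        if (EPOCH_BUMP_RECOVERABLE.contains(error)) {
            // Re-initialize: send InitProducerId with the current producer ID;
            // the broker returns the same ID with a bumped epoch, unless another
            // producer has already initialized a new producer ID.
            return "bump-epoch-and-retry";
        }
        return "fatal";
    }

    public static void main(String[] args) {
        System.out.println(handle(ApiError.UNKNOWN_PRODUCER_ID)); // bump-epoch-and-retry
        System.out.println(handle(ApiError.PRODUCER_FENCED));     // fatal
    }
}
```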



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-8805) Bump producer epoch following recoverable errors

2020-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037715#comment-17037715
 ] 

ASF GitHub Bot commented on KAFKA-8805:
---

hachikuji commented on pull request #7389: KAFKA-8805: Bump producer epoch on 
recoverable errors
URL: https://github.com/apache/kafka/pull/7389
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Bump producer epoch following recoverable errors
> 
>
> Key: KAFKA-8805
> URL: https://issues.apache.org/jira/browse/KAFKA-8805
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer 
>Affects Versions: 2.3.0
>Reporter: Bob Barrett
>Assignee: Bob Barrett
>Priority: Major
>
> As part of KIP-360, the producer needs to call the new InitProducerId API 
> after receiving UNKNOWN_PRODUCER_ID and INVALID_PRODUCER_MAPPING errors, 
> which will allow the producers to bump their epoch and continue processing 
> unless a new producer has already initialized a new producer ID.
> The broker change that this depends on is 
> https://issues.apache.org/jira/browse/KAFKA-8710.





[jira] [Commented] (KAFKA-9552) Stream should handle OutOfSequence exception thrown from Producer

2020-02-15 Thread John Roesler (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037691#comment-17037691
 ] 

John Roesler commented on KAFKA-9552:
-

Thanks, Matthias. What you say sounds right. I’m wondering if even aborting the 
transaction would be good enough, since the lost writes may have been from a 
previous transaction.

However, based on what you said, it does seem like the logic we have to 
rebalance in response may not be the right choice. 

> Stream should handle OutOfSequence exception thrown from Producer
> -
>
> Key: KAFKA-9552
> URL: https://issues.apache.org/jira/browse/KAFKA-9552
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 2.5.0
>Reporter: Boyang Chen
>Priority: Major
>
> As of today the stream thread could die from OutOfSequence error:
> {code:java}
>  [2020-02-12T07:14:35-08:00] 
> (streams-soak-2-5-eos_soak_i-03f89b1e566ac95cc_streamslog) 
> org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker 
> received an out of order sequence number.
>  [2020-02-12T07:14:35-08:00] 
> (streams-soak-2-5-eos_soak_i-03f89b1e566ac95cc_streamslog) [2020-02-12 
> 15:14:35,185] ERROR 
> [stream-soak-test-546f8754-5991-4d62-8565-dbe98d51638e-StreamThread-1] 
> stream-thread 
> [stream-soak-test-546f8754-5991-4d62-8565-dbe98d51638e-StreamThread-1] Failed 
> to commit stream task 3_2 due to the following error: 
> (org.apache.kafka.streams.processor.internals.AssignedStreamsTasks)
>  [2020-02-12T07:14:35-08:00] 
> (streams-soak-2-5-eos_soak_i-03f89b1e566ac95cc_streamslog) 
> org.apache.kafka.streams.errors.StreamsException: task [3_2] Abort sending 
> since an error caught with a previous record (timestamp 1581484094825) to 
> topic stream-soak-test-KSTREAM-AGGREGATE-STATE-STORE-49-changelog due 
> to org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker 
> received an out of order sequence number.
>  at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:154)
>  at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.access$500(RecordCollectorImpl.java:52)
>  at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl$1.onCompletion(RecordCollectorImpl.java:214)
>  at 
> org.apache.kafka.clients.producer.KafkaProducer$InterceptorCallback.onCompletion(KafkaProducer.java:1353)
> {code}
>  Although this is a fatal exception for the Producer, Streams should treat it 
> as an opportunity to reinitialize by doing a rebalance, instead of killing the 
> computation resource.





[jira] [Commented] (KAFKA-9552) Stream should handle OutOfSequence exception thrown from Producer

2020-02-15 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037683#comment-17037683
 ] 

Matthias J. Sax commented on KAFKA-9552:


I am not sure if we need to re-balance – if we would have missed a rebalance 
and lost the task, we would get a `ProducerFencedException`. Hence, on this 
error we should still be part of the consumer group.

From my understanding, an `OutOfOrderSequenceException` implies data loss, ie, 
we got an ack back, but on the next send the data is not in the log (this could 
happen if unclean leader election is enabled broker side) – otherwise it should 
indicate a severe bug.

While we could abort the current transaction and reinitialize the task (ie, 
refetch the input topic offsets, clean up the state, etc), I am wondering if we 
should do this, as it would mask a bug? Instead, it might be better to not 
catch the exception and fail fast, so we can report this error?

Btw: In `RecordCollectorImpl` in a recent PR we started to catch 
`OutOfOrderSequenceException` and rethrow `TaskMigratedException` for this case 
– however, I am not sure if we should keep this change or roll it back for the 
same reason.

\cc [~guozhang] [~hachikuji]
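To make the trade-off concrete, here is a simplified sketch of the two options discussed above. It is plain Java, not the actual `RecordCollectorImpl` code; the stand-in exception classes below merely mirror the names of the real Kafka Streams exceptions.

```java
// Simplified sketch of the two handling strategies for a send error:
// rethrow as a task-migration signal (triggering reinitialization via the
// rebalance path) vs. surface as fatal so a possible bug is not masked.
public class SendErrorSketch {
    static class OutOfOrderSequenceException extends RuntimeException {}

    static class TaskMigratedException extends RuntimeException {
        TaskMigratedException(Throwable cause) { super(cause); }
    }

    static class StreamsException extends RuntimeException {
        StreamsException(Throwable cause) { super(cause); }
    }

    // failFast = false: the behavior introduced in the recent PR
    //                   (treat the error as recoverable, reinitialize the task).
    // failFast = true:  the alternative argued for above
    //                   (report possible data loss instead of hiding it).
    static RuntimeException classify(RuntimeException sendError, boolean failFast) {
        if (sendError instanceof OutOfOrderSequenceException) {
            return failFast ? new StreamsException(sendError)
                            : new TaskMigratedException(sendError);
        }
        return new StreamsException(sendError);
    }
}
```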

> Stream should handle OutOfSequence exception thrown from Producer
> -
>
> Key: KAFKA-9552
> URL: https://issues.apache.org/jira/browse/KAFKA-9552
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 2.5.0
>Reporter: Boyang Chen
>Priority: Major
>
> As of today the stream thread could die from OutOfSequence error:
> {code:java}
>  [2020-02-12T07:14:35-08:00] 
> (streams-soak-2-5-eos_soak_i-03f89b1e566ac95cc_streamslog) 
> org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker 
> received an out of order sequence number.
>  [2020-02-12T07:14:35-08:00] 
> (streams-soak-2-5-eos_soak_i-03f89b1e566ac95cc_streamslog) [2020-02-12 
> 15:14:35,185] ERROR 
> [stream-soak-test-546f8754-5991-4d62-8565-dbe98d51638e-StreamThread-1] 
> stream-thread 
> [stream-soak-test-546f8754-5991-4d62-8565-dbe98d51638e-StreamThread-1] Failed 
> to commit stream task 3_2 due to the following error: 
> (org.apache.kafka.streams.processor.internals.AssignedStreamsTasks)
>  [2020-02-12T07:14:35-08:00] 
> (streams-soak-2-5-eos_soak_i-03f89b1e566ac95cc_streamslog) 
> org.apache.kafka.streams.errors.StreamsException: task [3_2] Abort sending 
> since an error caught with a previous record (timestamp 1581484094825) to 
> topic stream-soak-test-KSTREAM-AGGREGATE-STATE-STORE-49-changelog due 
> to org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker 
> received an out of order sequence number.
>  at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:154)
>  at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.access$500(RecordCollectorImpl.java:52)
>  at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl$1.onCompletion(RecordCollectorImpl.java:214)
>  at 
> org.apache.kafka.clients.producer.KafkaProducer$InterceptorCallback.onCompletion(KafkaProducer.java:1353)
> {code}
>  Although this is a fatal exception for the Producer, Streams should treat it 
> as an opportunity to reinitialize by doing a rebalance, instead of killing the 
> computation resource.





[jira] [Comment Edited] (KAFKA-9552) Stream should handle OutOfSequence exception thrown from Producer

2020-02-15 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037683#comment-17037683
 ] 

Matthias J. Sax edited comment on KAFKA-9552 at 2/16/20 1:22 AM:
-

I am not sure if we need to re-balance – if we would have missed a rebalance 
and lost the task, we would get a `ProducerFencedException`. Hence, on this 
error we should still be part of the consumer group.

From my understanding, an `OutOfOrderSequenceException` implies data loss, ie, 
we got an ack back, but on the next send the data is not in the log (this could 
happen if unclean leader election is enabled broker side) – otherwise it should 
indicate a severe bug.

While we could abort the current transaction and reinitialize the task (ie, 
refetch the input topic offsets, clean up the state, etc), I am wondering if we 
should do this, as it would mask a bug? Instead, it might be better to not 
catch the exception and fail fast, so we can report this error?

Btw: In `RecordCollectorImpl` in a recent PR we started to catch 
`OutOfOrderSequenceException` and rethrow `TaskMigratedException` for this case 
– however, I am not sure if we should keep this change or roll it back for the 
same reason.

\cc [~guozhang] [~hachikuji] [~vvcephei]


was (Author: mjsax):
I am not sure if we need to re-balance – if we would have missed a rebalance 
and lost the task, we would get a `ProducerFencedException`. Hence, on this 
error we should still be part of the consumer group.

From my understanding, an `OutOfOrderSequenceException` implies data loss, ie, 
we got an ack back, but on the next send the data is not in the log (this could 
happen if unclean leader election is enabled broker side) – otherwise it should 
indicate a severe bug.

While we could abort the current transaction and reinitialize the task (ie, 
refetch the input topic offsets, clean up the state, etc), I am wondering if we 
should do this, as it would mask a bug? Instead, it might be better to not 
catch the exception and fail fast, so we can report this error?

Btw: In `RecordCollectorImpl` in a recent PR we started to catch 
`OutOfOrderSequenceException` and rethrow `TaskMigratedException` for this case 
– however, I am not sure if we should keep this change or roll it back for the 
same reason.

\cc [~guozhang] [~hachikuji]

> Stream should handle OutOfSequence exception thrown from Producer
> -
>
> Key: KAFKA-9552
> URL: https://issues.apache.org/jira/browse/KAFKA-9552
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 2.5.0
>Reporter: Boyang Chen
>Priority: Major
>
> As of today the stream thread could die from OutOfSequence error:
> {code:java}
>  [2020-02-12T07:14:35-08:00] 
> (streams-soak-2-5-eos_soak_i-03f89b1e566ac95cc_streamslog) 
> org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker 
> received an out of order sequence number.
>  [2020-02-12T07:14:35-08:00] 
> (streams-soak-2-5-eos_soak_i-03f89b1e566ac95cc_streamslog) [2020-02-12 
> 15:14:35,185] ERROR 
> [stream-soak-test-546f8754-5991-4d62-8565-dbe98d51638e-StreamThread-1] 
> stream-thread 
> [stream-soak-test-546f8754-5991-4d62-8565-dbe98d51638e-StreamThread-1] Failed 
> to commit stream task 3_2 due to the following error: 
> (org.apache.kafka.streams.processor.internals.AssignedStreamsTasks)
>  [2020-02-12T07:14:35-08:00] 
> (streams-soak-2-5-eos_soak_i-03f89b1e566ac95cc_streamslog) 
> org.apache.kafka.streams.errors.StreamsException: task [3_2] Abort sending 
> since an error caught with a previous record (timestamp 1581484094825) to 
> topic stream-soak-test-KSTREAM-AGGREGATE-STATE-STORE-49-changelog due 
> to org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker 
> received an out of order sequence number.
>  at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:154)
>  at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.access$500(RecordCollectorImpl.java:52)
>  at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl$1.onCompletion(RecordCollectorImpl.java:214)
>  at 
> org.apache.kafka.clients.producer.KafkaProducer$InterceptorCallback.onCompletion(KafkaProducer.java:1353)
> {code}
>  Although this is a fatal exception for the Producer, Streams should treat it 
> as an opportunity to reinitialize by doing a rebalance, instead of killing the 
> computation resource.





[jira] [Updated] (KAFKA-9373) Improve shutdown performance via lazy accessing the offset and time indices.

2020-02-15 Thread Matthias J. Sax (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-9373:
---
Fix Version/s: (was: 2.3.1)
   (was: 2.4.0)
   (was: 2.3.0)

> Improve shutdown performance via lazy accessing the offset and time indices.
> 
>
> Key: KAFKA-9373
> URL: https://issues.apache.org/jira/browse/KAFKA-9373
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 2.3.0, 2.4.0, 2.3.1
>Reporter: Adem Efe Gencer
>Assignee: Adem Efe Gencer
>Priority: Major
>
> KAFKA-7283 enabled lazy mmap on index files by initializing indices on demand 
> rather than performing costly disk/memory operations when creating all 
> indices on broker startup. This helped reduce the startup time of brokers. 
> However, segment indices are still created when segments are closed, 
> regardless of whether they were ever needed.
>  
> Ideally we should:
>  * Improve shutdown performance via lazy accessing the offset and time 
> indices.
>  * Eliminate redundant disk accesses and memory mapped operations while 
> deleting or renaming files that back segment indices.
>  * Prevent illegal accesses to underlying indices of a closed segment, which 
> would lead to memory leaks due to recreation of the underlying memory mapped 
> objects.
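The lazy-access idea above can be sketched with a small holder type. This is illustrative only: the real implementation lives in Kafka's log layer, and the class and method names here are made up. The point is that closing a segment whose index was never touched skips the disk and mmap work entirely, and accesses after close are rejected so the memory-mapped object cannot be recreated.

```java
import java.util.function.Supplier;

// Hypothetical lazy holder for a segment index: the costly open/mmap work
// happens only on first real access, never on close/delete/rename.
public class LazySegmentIndexSketch<T> {
    private final Supplier<T> loader; // performs the costly open/mmap
    private T index;                  // null until first real access
    private boolean closed;

    LazySegmentIndexSketch(Supplier<T> loader) { this.loader = loader; }

    synchronized T get() {
        if (closed)
            // Prevent illegal access to a closed segment's index, which
            // would otherwise recreate the underlying mmap and leak it.
            throw new IllegalStateException("index of a closed segment");
        if (index == null)
            index = loader.get(); // materialize on demand
        return index;
    }

    synchronized boolean isMaterialized() { return index != null; }

    // Closing never forces creation of an index that was never accessed.
    synchronized void close() { closed = true; index = null; }
}
```

Shutdown of a never-read segment then touches no index at all, which is the performance win described above.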





[jira] [Updated] (KAFKA-9300) Create a topic based on the specified brokers

2020-02-15 Thread Matthias J. Sax (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-9300:
---
Fix Version/s: (was: 2.4.0)

> Create a topic based on the specified brokers
> -
>
> Key: KAFKA-9300
> URL: https://issues.apache.org/jira/browse/KAFKA-9300
> Project: Kafka
>  Issue Type: New Feature
>  Components: clients
>Affects Versions: 2.3.0
>Reporter: weiwei
>Assignee: weiwei
>Priority: Major
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> Generally, a Kafka cluster serves multiple businesses. To reduce the impact 
> between businesses, many companies isolate brokers so that the topics of 
> certain businesses are created only on specified brokers. The current topic 
> creation script supports this only through replica-assignment, which is not 
> convenient for specifying a set of brokers. Therefore, the following function 
> should be added: create a topic based on the specified brokers. A 
> replica-assignment-brokers parameter is added to indicate the broker range 
> over which the topic is distributed. If this parameter is not set, all broker 
> nodes in the cluster are used. For example: kafka-topics.sh --create --topic 
> test06 --partitions 2 --replication-factor 1 --zookeeper zkurl 
> --replica-assignment-brokers=1,2.
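The workaround today is to compute a full --replica-assignment string by hand. A rough sketch of what such a helper could do (purely illustrative, not part of any Kafka tool; the class and method names are invented) is to round-robin replicas over only the specified brokers:

```java
import java.util.ArrayList;
import java.util.List;

public class AssignmentSketch {
    // Round-robin partitions and replicas over a restricted broker list,
    // mimicking what a --replica-assignment-brokers option might generate.
    // Result: one replica list per partition, leaders rotating over brokers.
    static List<List<Integer>> assign(List<Integer> brokers, int partitions, int replicationFactor) {
        if (replicationFactor > brokers.size())
            throw new IllegalArgumentException("replication factor exceeds broker count");
        List<List<Integer>> assignment = new ArrayList<>();
        for (int p = 0; p < partitions; p++) {
            List<Integer> replicas = new ArrayList<>();
            for (int r = 0; r < replicationFactor; r++)
                replicas.add(brokers.get((p + r) % brokers.size()));
            assignment.add(replicas);
        }
        return assignment;
    }
}
```

For brokers {1, 2}, 2 partitions, and replication factor 1, this yields [[1], [2]], i.e. the string "1,2" that could be passed to kafka-topics.sh --replica-assignment today.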





[jira] [Updated] (KAFKA-9532) Deleting the consumer group programmatically using RESTful API

2020-02-15 Thread Matthias J. Sax (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-9532:
---
Fix Version/s: (was: 2.4.0)

> Deleting the consumer group programmatically using RESTful API
> -
>
> Key: KAFKA-9532
> URL: https://issues.apache.org/jira/browse/KAFKA-9532
> Project: Kafka
>  Issue Type: Wish
>  Components: clients
>Affects Versions: 2.4.0
>Reporter: Rakshith  Mamidala
>Priority: Major
>
> As a requirement in our project, instead of listening for messages and 
> consuming/storing message data into a database, we create consumer groups at 
> run time, one per user (to avoid thread-safety issues), use consumer.poll and 
> consumer.seekToBeginning, and once all messages are read we close the 
> connection and unsubscribe the consumer group.
>  
> What happens in Kafka is that the consumer groups move from active state to 
> DEAD state but are not removed/deleted; Kafka Tools shows all the consumers, 
> even those that are DEAD.
>  
> *What we want:*
>  # How to remove/delete the consumer groups programmatically.
>  # Is there any REST endpoint / command line / script to delete the consumer 
> groups? What are they?
>  # What impact can DEAD consumer groups have in a production environment?
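For what it is worth, Kafka does expose this: the Java Admin client has a deleteConsumerGroups(...) method, and kafka-consumer-groups.sh --bootstrap-server ... --delete --group &lt;name&gt; does the same from the command line; Apache Kafka itself ships no REST endpoint (REST layers come from external proxies). A hedged sketch of the cleanup logic, using a plain map of group states instead of a live cluster so it is self-contained, might look like:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class GroupCleanupSketch {
    // Pick the groups that are safe to delete: a group can only be deleted
    // once it has no active members (e.g. Dead or Empty state).
    static List<String> deletableGroups(Map<String, String> groupStates) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> e : groupStates.entrySet())
            if (e.getValue().equals("Dead") || e.getValue().equals("Empty"))
                result.add(e.getKey());
        return result;
    }
    // Against a real cluster, the deletion itself would be roughly:
    //   admin.deleteConsumerGroups(deletableGroups(states)).all().get();
}
```

Regarding the third question: DEAD/empty groups that only hold committed offsets are eventually cleaned up by the broker's offset retention, but relying on that (or leaving large numbers of them around) clutters tooling, so explicit deletion is the cleaner option.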





[jira] [Commented] (KAFKA-9515) Upgrade ZooKeeper to 3.5.7

2020-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037599#comment-17037599
 ] 

ASF GitHub Bot commented on KAFKA-9515:
---

ijuma commented on pull request #8125: KAFKA-9515: Upgrade ZooKeeper to 3.5.7
URL: https://github.com/apache/kafka/pull/8125
 
 
   A couple of critical fixes:
   
   ZOOKEEPER-3644: Data loss after upgrading standalone ZK server 3.4.14 to 
3.5.6 with snapshot.trust.empty=true
   ZOOKEEPER-3701: Split brain on log disk full (3.5) 
   
   Full release notes:
   
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801=12346098
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 



> Upgrade ZooKeeper to 3.5.7
> --
>
> Key: KAFKA-9515
> URL: https://issues.apache.org/jira/browse/KAFKA-9515
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Blocker
> Fix For: 2.5.0, 2.4.1
>
>
> There are some critical fixes in ZK 3.5.7 and the first RC has been posted:
> [https://mail-archives.apache.org/mod_mbox/zookeeper-dev/202002.mbox/%3cCAGH6_KiULzemT-V4x_2ybWeKLMvQ+eh=q-dzsiz8a-ypp5t...@mail.gmail.com%3e]





[jira] [Resolved] (KAFKA-9137) Maintenance of FetchSession cache causing FETCH_SESSION_ID_NOT_FOUND in live sessions

2020-02-15 Thread Lucas Bradstreet (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lucas Bradstreet resolved KAFKA-9137.
-
Resolution: Fixed

Closed by [https://github.com/apache/kafka/pull/7640]

> Maintenance of FetchSession cache causing FETCH_SESSION_ID_NOT_FOUND in live 
> sessions
> -
>
> Key: KAFKA-9137
> URL: https://issues.apache.org/jira/browse/KAFKA-9137
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: Lucas Bradstreet
>Priority: Major
>
> We have recently seen cases where brokers end up in a bad state where fetch 
> session evictions occur at a high rate (> 16 per second) after a roll. This 
> increase in eviction rate included the following pattern in our logs:
>  
> {noformat}
> broker 6: October 31st 2019, 17:52:45.496 Created a new incremental 
> FetchContext for session id 2046264334, epoch 9790: added (), updated (), 
> removed ()
> broker 6: October 31st 2019, 17:52:45.496 Created a new incremental 
> FetchContext for session id 2046264334, epoch 9791: added (), updated (), 
> removed () broker 6: October 31st 2019, 17:52:45.500 Created a new 
> incremental FetchContext for session id 2046264334, epoch 9792: added (), 
> updated (lkc-7nv6o_tenant_soak_topic_144p-67), removed () 
> broker 6: October 31st 2019, 17:52:45.501 Created a new incremental 
> FetchContext for session id 2046264334, epoch 9793: added (), updated 
> (lkc-7nv6o_tenant_soak_topic_144p-59, lkc-7nv6o_tenant_soak_topic_144p-123, 
> lkc-7nv6o_tenant_soak_topic_144p-11, lkc-7nv6o_tenant_soak_topic_144p-3, 
> lkc-7nv6o_tenant_soak_topic_144p-67, lkc-7nv6o_tenant_soak_topic_144p-115), 
> removed () 
> broker 6: October 31st 2019, 17:52:45.501 Evicting stale FetchSession 
> 2046264334. 
> broker 6: October 31st 2019, 17:52:45.502 Session error for 2046264334: no 
> such session ID found. 
> broker 4: October 31st 2019, 17:52:45.813 [ReplicaFetcher replicaId=4, 
> leaderId=6, fetcherId=0] Node 6 was unable to process the fetch request with 
> (sessionId=2046264334, epoch=9793): FETCH_SESSION_ID_NOT_FOUND.  
> {noformat}
> This pattern appears to be problematic for two reasons. Firstly, the replica 
> fetcher for broker 4 was clearly able to send multiple incremental fetch 
> requests to broker 6, and receive replies, and did so right up to the point 
> where broker 6 evicted its fetch session within milliseconds of multiple 
> fetch requests. The second problem is that replica fetchers are considered 
> privileged for the fetch session cache, and should not be evicted by consumer 
> fetch sessions. This cluster only has 12 brokers and 1000 fetch session cache 
> slots (the default for max.incremental.fetch.session.cache.slots), and it is 
> thus very unlikely that this session should have been evicted by another 
> replica fetcher session.
> This cluster also appears to be causing cycles of fetch session evictions 
> where the cluster never stabilizes into a state where fetch sessions are not 
> evicted. The above logs are the best example I could find of a case where a 
> session clearly should not have been evicted.
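As a stopgap while the eviction behavior is investigated, the one broker knob that directly affects cache pressure is the session cache size; the value below is illustrative only, not a recommendation, and costs extra broker memory per cached session:

```properties
# server.properties (illustrative value only; default is 1000)
max.incremental.fetch.session.cache.slots=2000
```

Note this does not address the core bug described above, since replica fetchers are supposed to be privileged in the cache regardless of its size.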





[jira] [Commented] (KAFKA-9545) Flaky Test `RegexSourceIntegrationTest.testRegexMatchesTopicsAWhenDeleted`

2020-02-15 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037550#comment-17037550
 ] 

Matthias J. Sax commented on KAFKA-9545:


[https://builds.apache.org/job/kafka-pr-jdk11-scala2.13/4733/testReport/junit/org.apache.kafka.streams.integration/RegexSourceIntegrationTest/testRegexMatchesTopicsAWhenDeleted/]

> Flaky Test `RegexSourceIntegrationTest.testRegexMatchesTopicsAWhenDeleted`
> --
>
> Key: KAFKA-9545
> URL: https://issues.apache.org/jira/browse/KAFKA-9545
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Jason Gustafson
>Assignee: Boyang Chen
>Priority: Major
>
> https://builds.apache.org/job/kafka-pr-jdk11-scala2.13/4678/testReport/org.apache.kafka.streams.integration/RegexSourceIntegrationTest/testRegexMatchesTopicsAWhenDeleted/
> {code}
> java.lang.AssertionError: Condition not met within timeout 15000. Stream 
> tasks not updated
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:26)
>   at 
> org.apache.kafka.test.TestUtils.lambda$waitForCondition$5(TestUtils.java:367)
>   at 
> org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:415)
>   at 
> org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:383)
>   at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:366)
>   at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:337)
>   at 
> org.apache.kafka.streams.integration.RegexSourceIntegrationTest.testRegexMatchesTopicsAWhenDeleted(RegexSourceIntegrationTest.java:224)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> {code}





[jira] [Commented] (KAFKA-9545) Flaky Test `RegexSourceIntegrationTest.testRegexMatchesTopicsAWhenDeleted`

2020-02-15 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037549#comment-17037549
 ] 

Matthias J. Sax commented on KAFKA-9545:


[https://builds.apache.org/job/kafka-pr-jdk8-scala2.12/720/testReport/junit/org.apache.kafka.streams.integration/RegexSourceIntegrationTest/testRegexMatchesTopicsAWhenDeleted/]

> Flaky Test `RegexSourceIntegrationTest.testRegexMatchesTopicsAWhenDeleted`
> --
>
> Key: KAFKA-9545
> URL: https://issues.apache.org/jira/browse/KAFKA-9545
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Jason Gustafson
>Assignee: Boyang Chen
>Priority: Major
>
> https://builds.apache.org/job/kafka-pr-jdk11-scala2.13/4678/testReport/org.apache.kafka.streams.integration/RegexSourceIntegrationTest/testRegexMatchesTopicsAWhenDeleted/
> {code}
> java.lang.AssertionError: Condition not met within timeout 15000. Stream 
> tasks not updated
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:26)
>   at 
> org.apache.kafka.test.TestUtils.lambda$waitForCondition$5(TestUtils.java:367)
>   at 
> org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:415)
>   at 
> org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:383)
>   at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:366)
>   at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:337)
>   at 
> org.apache.kafka.streams.integration.RegexSourceIntegrationTest.testRegexMatchesTopicsAWhenDeleted(RegexSourceIntegrationTest.java:224)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> {code}





[jira] [Commented] (KAFKA-9545) Flaky Test `RegexSourceIntegrationTest.testRegexMatchesTopicsAWhenDeleted`

2020-02-15 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037547#comment-17037547
 ] 

Matthias J. Sax commented on KAFKA-9545:


[https://builds.apache.org/job/kafka-pr-jdk11-scala2.13/4734/testReport/org.apache.kafka.streams.integration/RegexSourceIntegrationTest/testRegexMatchesTopicsAWhenDeleted/]

> Flaky Test `RegexSourceIntegrationTest.testRegexMatchesTopicsAWhenDeleted`
> --
>
> Key: KAFKA-9545
> URL: https://issues.apache.org/jira/browse/KAFKA-9545
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Jason Gustafson
>Assignee: Boyang Chen
>Priority: Major
>
> https://builds.apache.org/job/kafka-pr-jdk11-scala2.13/4678/testReport/org.apache.kafka.streams.integration/RegexSourceIntegrationTest/testRegexMatchesTopicsAWhenDeleted/
> {code}
> java.lang.AssertionError: Condition not met within timeout 15000. Stream 
> tasks not updated
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:26)
>   at 
> org.apache.kafka.test.TestUtils.lambda$waitForCondition$5(TestUtils.java:367)
>   at 
> org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:415)
>   at 
> org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:383)
>   at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:366)
>   at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:337)
>   at 
> org.apache.kafka.streams.integration.RegexSourceIntegrationTest.testRegexMatchesTopicsAWhenDeleted(RegexSourceIntegrationTest.java:224)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> {code}





[jira] [Commented] (KAFKA-9541) Flaky Test DescribeConsumerGroupTest#testDescribeGroupMembersWithShortInitializationTimeout

2020-02-15 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037546#comment-17037546
 ] 

Matthias J. Sax commented on KAFKA-9541:


[https://builds.apache.org/job/kafka-pr-jdk8-scala2.12/721/testReport/junit/kafka.admin/DescribeConsumerGroupTest/testDescribeGroupMembersWithShortInitializationTimeout/]

Different stack trace:
{quote}java.lang.AssertionError: expected: but was:
 at org.junit.Assert.fail(Assert.java:89)
 at org.junit.Assert.failNotEquals(Assert.java:835)
 at org.junit.Assert.assertEquals(Assert.java:120)
 at org.junit.Assert.assertEquals(Assert.java:146)
 at kafka.admin.DescribeConsumerGroupTest.testDescribeGroupMembersWithShortInitializationTimeout(DescribeConsumerGroupTest.scala:630){quote}

> Flaky Test 
> DescribeConsumerGroupTest#testDescribeGroupMembersWithShortInitializationTimeout
> ---
>
> Key: KAFKA-9541
> URL: https://issues.apache.org/jira/browse/KAFKA-9541
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.4.0
>Reporter: huxihx
>Assignee: huxihx
>Priority: Major
>
> h3. Error Message
> java.lang.AssertionError: assertion failed
> h3. Stacktrace
> java.lang.AssertionError: assertion failed
>   at scala.Predef$.assert(Predef.scala:267)
>   at kafka.admin.DescribeConsumerGroupTest.testDescribeGroupMembersWithShortInitializationTimeout(DescribeConsumerGroupTest.scala:630)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>   at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
>   at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>   at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>   at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62)
>   at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>   at jdk.internal.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
>   at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:33)
>   at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:94)
>   at com.sun.proxy.$Proxy2.processTestClass(Unknown Source) at 
> 

[jira] [Commented] (KAFKA-9563) Fix Kafka connect consumer and producer override documentation

2020-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037518#comment-17037518
 ] 

ASF GitHub Bot commented on KAFKA-9563:
---

blcksrx commented on pull request #8124: [KAFKA-9563] Fix Kafka connect 
consumer and producer override docs
URL: https://github.com/apache/kafka/pull/8124
 
 
   The correct parameters for overriding producer config or consumer config in 
**Kafka Connect** are  
   `producer.override.[PRODUCER-CONFIG]` or 
`consumer.override.[CONSUMER-CONFIG]`
   
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Fix Kafka connect consumer and producer override documentation
> --
>
> Key: KAFKA-9563
> URL: https://issues.apache.org/jira/browse/KAFKA-9563
> Project: Kafka
>  Issue Type: Bug
>  Components: docs, documentation, KafkaConnect
>Affects Versions: 2.3.1
>Reporter: Sayed Mohammad Hossein Torabi
>Priority: Minor
>
> The correct parameters for overriding producer config or consumer config in 
> *Kafka Connect* are  
> {code:java}
> producer.override.[PRODUCER-CONFIG] {code}
> or
> {code:java}
> consumer.override.[CONSUMER-CONFIG]{code}
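As a hedged illustration of the prefixes described above, here is a minimal sketch of a connector registration payload that overrides one consumer property. The connector name, class, and topic are hypothetical, and per-connector overrides additionally require the worker's `connector.client.config.override.policy` setting to permit them:

```python
import json

# Hypothetical sink-connector payload; the `consumer.override.` prefix
# applies the property only to this connector's consumer.
connector = {
    "name": "example-sink",
    "config": {
        "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
        "topics": "example-topic",
        # Per-connector consumer override:
        "consumer.override.max.poll.records": "100",
    },
}

# A body like this would be POSTed to the Connect REST API.
print(json.dumps(connector, indent=2))
```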





[jira] [Updated] (KAFKA-9563) Fix Kafka connect consumer and producer override documentation

2020-02-15 Thread Sayed Mohammad Hossein Torabi (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sayed Mohammad Hossein Torabi updated KAFKA-9563:
-
Summary: Fix Kafka connect consumer and producer override documentation  
(was: Fix consumer and producer override documentation)

> Fix Kafka connect consumer and producer override documentation
> --
>
> Key: KAFKA-9563
> URL: https://issues.apache.org/jira/browse/KAFKA-9563
> Project: Kafka
>  Issue Type: Bug
>  Components: docs, documentation, KafkaConnect
>Affects Versions: 2.3.1
>Reporter: Sayed Mohammad Hossein Torabi
>Priority: Minor
>
> The correct parameters for overriding producer config or consumer config in 
> *Kafka Connect* are  
> {code:java}
> producer.override.[PRODUCER-CONFIG] {code}
> or
> {code:java}
> consumer.override.[CONSUMER-CONFIG]{code}





[jira] [Created] (KAFKA-9563) Fix consumer and producer override documentation

2020-02-15 Thread Sayed Mohammad Hossein Torabi (Jira)
Sayed Mohammad Hossein Torabi created KAFKA-9563:


 Summary: Fix consumer and producer override documentation
 Key: KAFKA-9563
 URL: https://issues.apache.org/jira/browse/KAFKA-9563
 Project: Kafka
  Issue Type: Bug
  Components: docs, documentation, KafkaConnect
Affects Versions: 2.3.1
Reporter: Sayed Mohammad Hossein Torabi


The correct parameters for overriding producer config or consumer config in 
*Kafka Connect* are  
{code:java}
producer.override.[PRODUCER-CONFIG] {code}
or
{code:java}
consumer.override.[CONSUMER-CONFIG]{code}





[jira] [Commented] (KAFKA-8507) Support --bootstrap-server in all command line tools

2020-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037491#comment-17037491
 ] 

ASF GitHub Bot commented on KAFKA-8507:
---

stanislavkozlovski commented on pull request #8123: KAFKA-8507: Unify 
bootstrap-server flag for command line tools (KIP-499) part-2
URL: https://github.com/apache/kafka/pull/8123
 
 
   This patch updates ReplicaVerificationTool to add and prefer the 
--bootstrap-server flag for defining the connection point of the Kafka cluster. 
This change is part of KIP-499: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-499+-+Unify+connection+name+flag+for+command+line+tool.
   
 



> Support --bootstrap-server in all command line tools
> 
>
> Key: KAFKA-8507
> URL: https://issues.apache.org/jira/browse/KAFKA-8507
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Reporter: Jason Gustafson
>Assignee: Mitchell
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.5.0
>
>
> This is an unambitious initial move toward standardizing the command line 
> tools. We have favored the name {{\-\-bootstrap-server}} in all new tools 
> since it matches the config {{bootstrap.servers}} which is used by all 
> clients. Some older commands use {{\-\-broker-list}} or 
> {{\-\-bootstrap-servers}} and maybe other exotic variations. We should 
> support {{\-\-bootstrap-server}} in all commands and deprecate the other 
> options.
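The deprecation path the issue proposes can be sketched as follows. This is an illustrative Python argparse sketch only, not the actual tool code (the real Kafka tools are JVM-based and use their own option parser); it shows a tool accepting both the preferred flag and the deprecated one, preferring the former:

```python
import argparse

def build_parser():
    # Accept both flags during the transition window (names illustrative).
    parser = argparse.ArgumentParser(description="connection-flag sketch")
    parser.add_argument("--bootstrap-server", dest="bootstrap_server",
                        help="the Kafka server(s) to connect to")
    parser.add_argument("--broker-list", dest="broker_list",
                        help="DEPRECATED: use --bootstrap-server instead")
    return parser

def resolve_connection(argv):
    args = build_parser().parse_args(argv)
    # Prefer the new flag; fall back to the deprecated one if present.
    return args.bootstrap_server or args.broker_list

print(resolve_connection(["--bootstrap-server", "localhost:9092"]))  # localhost:9092
print(resolve_connection(["--broker-list", "localhost:9092"]))       # localhost:9092
```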


