[jira] [Updated] (KAFKA-6266) Kafka 1.0.0 : Repeated occurrence of WARN Resetting first dirty offset of __consumer_offsets-xx to log start offset 203569 since the checkpointed offset 120955 is invalid
[ https://issues.apache.org/jira/browse/KAFKA-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ismael Juma updated KAFKA-6266:
-------------------------------
    Fix Version/s: 2.4.1
                   2.5.0

> Kafka 1.0.0 : Repeated occurrence of WARN Resetting first dirty offset of
> __consumer_offsets-xx to log start offset 203569 since the checkpointed
> offset 120955 is invalid. (kafka.log.LogCleanerManager$)
> --------------------------------------------------------------------------
>
>                 Key: KAFKA-6266
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6266
>             Project: Kafka
>          Issue Type: Bug
>          Components: offset manager
>    Affects Versions: 1.0.0, 1.0.1
>         Environment: CentOS 7, Apache kafka_2.12-1.0.0
>            Reporter: VinayKumar
>            Priority: Major
>             Fix For: 2.5.0, 2.4.1
>
> I upgraded Kafka from 0.10.2.1 to 1.0.0. Since then, I see the warnings
> below in the log. They appear continuously, and I want them fixed so that
> they won't repeat. Can someone please help me fix these warnings?
> {code}
> WARN Resetting first dirty offset of __consumer_offsets-17 to log start offset 3346 since the checkpointed offset 3332 is invalid. (kafka.log.LogCleanerManager$)
> WARN Resetting first dirty offset of __consumer_offsets-23 to log start offset 4 since the checkpointed offset 1 is invalid. (kafka.log.LogCleanerManager$)
> WARN Resetting first dirty offset of __consumer_offsets-19 to log start offset 203569 since the checkpointed offset 120955 is invalid. (kafka.log.LogCleanerManager$)
> WARN Resetting first dirty offset of __consumer_offsets-35 to log start offset 16957 since the checkpointed offset 7 is invalid. (kafka.log.LogCleanerManager$)
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (KAFKA-6266) Kafka 1.0.0 : Repeated occurrence of WARN Resetting first dirty offset of __consumer_offsets-xx to log start offset 203569 since the checkpointed offset 120955 is invalid
[ https://issues.apache.org/jira/browse/KAFKA-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987550#comment-16987550 ]

Jun Rao commented on KAFKA-6266:
--------------------------------

Looking at the code where the "Resetting first dirty offset" warning is emitted: it seems that we only return the new cleaning offset based on the log start offset, but do not write the new value to the cleaner offset checkpoint file. So, in the case where there is nothing to clean, this warning will repeat every time the log cleaner checks for logs to clean. To fix this, we should overwrite the cleaner offset checkpoint file as well.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
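Jun Rao's diagnosis can be illustrated with a small sketch (hypothetical names; the actual code is Scala in `kafka.log.LogCleanerManager`, and a `Map` stands in for the cleaner-offset-checkpoint file): persisting the reset offset back to the checkpoint means the invalid-offset branch, and its WARN log, fires once per partition rather than on every cleaner pass.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the fix described above: when the checkpointed
// offset is below the log start offset, reset the first dirty offset AND
// write the new value back to the checkpoint, so the reset (and warning)
// happens once instead of on every cleaning round.
public class CleanerCheckpointSketch {
    private final Map<String, Long> checkpoint = new HashMap<>();

    /** Returns the offset cleaning should start from, repairing the checkpoint if stale. */
    public long firstDirtyOffset(String partition, long logStartOffset) {
        long checkpointed = checkpoint.getOrDefault(partition, 0L);
        if (checkpointed < logStartOffset) {
            // Before the fix: only the return value was adjusted, so this
            // branch repeated every time the cleaner checked for logs to clean.
            checkpoint.put(partition, logStartOffset);
            return logStartOffset;
        }
        return checkpointed;
    }

    public long checkpointedOffset(String partition) {
        return checkpoint.getOrDefault(partition, 0L);
    }
}
```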
[jira] [Updated] (KAFKA-9261) NPE when updating client metadata
[ https://issues.apache.org/jira/browse/KAFKA-9261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ismael Juma updated KAFKA-9261:
-------------------------------
    Fix Version/s: 2.3.2
                   2.4.0

> NPE when updating client metadata
> ---------------------------------
>
>                 Key: KAFKA-9261
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9261
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Jason Gustafson
>            Assignee: Jason Gustafson
>            Priority: Major
>             Fix For: 2.4.0, 2.3.2
>
> We have seen the following exception recently:
> {code}
> java.lang.NullPointerException
>     at java.base/java.util.Objects.requireNonNull(Objects.java:221)
>     at org.apache.kafka.common.Cluster.<init>(Cluster.java:134)
>     at org.apache.kafka.common.Cluster.<init>(Cluster.java:89)
>     at org.apache.kafka.clients.MetadataCache.computeClusterView(MetadataCache.java:120)
>     at org.apache.kafka.clients.MetadataCache.<init>(MetadataCache.java:82)
>     at org.apache.kafka.clients.MetadataCache.<init>(MetadataCache.java:58)
>     at org.apache.kafka.clients.Metadata.handleMetadataResponse(Metadata.java:325)
>     at org.apache.kafka.clients.Metadata.update(Metadata.java:252)
>     at org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater.handleCompletedMetadataResponse(NetworkClient.java:1059)
>     at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:845)
>     at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:548)
>     at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:262)
>     at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:233)
>     at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1281)
>     at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1225)
>     at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1201)
> {code}
> The client assumes that if a leader is included in the response, then node
> information must also be available. There are at least a couple of possible
> reasons this assumption can fail:
> 1. The client is able to detect stale partition metadata using the available
> leader epoch information. If stale partition metadata is detected, the client
> ignores it and uses the last known metadata. However, it cannot detect stale
> broker information and will always accept the latest update. This means that
> the latest metadata may be a mix of multiple metadata responses, and therefore
> the invariant will not generally hold.
> 2. There is no lock which protects both the fetching of partition metadata
> and the live brokers when handling a Metadata request. This means an
> UpdateMetadata request can arrive concurrently and break the intended
> invariant.
> Case 2 seems to have been possible for a long time, but it should be extremely
> rare. Case 1 was only made possible by KIP-320, which added leader epoch
> tracking. It should also be rare, but the window for inconsistent metadata is
> probably a bit bigger than the window for a concurrent update.
> To fix this, we should make the client more defensive about metadata updates
> and not assume that the leader is among the live endpoints.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
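The "more defensive" handling the description calls for might look like the following sketch (illustrative types only, not the actual `Metadata`/`MetadataCache` internals): the leader id from partition metadata is resolved against the live-node map, and a miss yields "leader unknown" instead of a null `Node` reaching the `Cluster` constructor.

```java
import java.util.Map;
import java.util.Optional;

// Illustrative sketch: resolve a partition's leader id against the
// live-broker map and treat a miss as "leader unknown" (triggering a
// metadata refresh) rather than constructing a cluster view with a null node.
public class LeaderResolutionSketch {
    public static Optional<String> leaderEndpoint(int leaderId, Map<Integer, String> liveNodes) {
        // Optional.empty() means the leader is not among the live endpoints.
        return Optional.ofNullable(liveNodes.get(leaderId));
    }
}
```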
[jira] [Assigned] (KAFKA-9258) Connect ConnectorStatusMetricsGroup sometimes NPE
[ https://issues.apache.org/jira/browse/KAFKA-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Randall Hauch reassigned KAFKA-9258:
------------------------------------
    Assignee: Cyrus Vafadari

> Connect ConnectorStatusMetricsGroup sometimes NPE
> -------------------------------------------------
>
>                 Key: KAFKA-9258
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9258
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>    Affects Versions: 2.4.0, 2.5.0
>            Reporter: Cyrus Vafadari
>            Assignee: Cyrus Vafadari
>            Priority: Blocker
>             Fix For: 2.4.0, 2.5.0
>
> java.lang.NullPointerException
>     at org.apache.kafka.connect.runtime.Worker$ConnectorStatusMetricsGroup.recordTaskRemoved(Worker.java:901)
>     at org.apache.kafka.connect.runtime.Worker.awaitStopTask(Worker.java:720)
>     at org.apache.kafka.connect.runtime.Worker.awaitStopTasks(Worker.java:740)
>     at org.apache.kafka.connect.runtime.Worker.stopAndAwaitTasks(Worker.java:758)
>     at org.apache.kafka.connect.runtime.distributed.DistributedHerder.processTaskConfigUpdatesWithIncrementalCooperative(DistributedHerder.java:575)
>     at org.apache.kafka.connect.runtime.distributed.DistributedHerder.tick(DistributedHerder.java:396)
>     at org.apache.kafka.connect.runtime.distributed.DistributedHerder.run(DistributedHerder.java:282)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
>     at java.util.concurrent.FutureTask.run(FutureTask.java)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Resolved] (KAFKA-9258) Connect ConnectorStatusMetricsGroup sometimes NPE
[ https://issues.apache.org/jira/browse/KAFKA-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Randall Hauch resolved KAFKA-9258.
----------------------------------
    Reviewer: Randall Hauch
    Resolution: Fixed

Merged into `trunk` and backported to the `2.4` branch after this was approved as a blocker for AK 2.4.0.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (KAFKA-9261) NPE when updating client metadata
[ https://issues.apache.org/jira/browse/KAFKA-9261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987487#comment-16987487 ]

ASF GitHub Bot commented on KAFKA-9261:
---------------------------------------

ijuma commented on pull request #7772: KAFKA-9261; Client should handle inconsistent leader metadata
URL: https://github.com/apache/kafka/pull/7772

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (KAFKA-9088) Consolidate InternalMockProcessorContext and MockInternalProcessorContext
[ https://issues.apache.org/jira/browse/KAFKA-9088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987478#comment-16987478 ]

John Roesler commented on KAFKA-9088:
-------------------------------------

Hey all, I can see I've missed some discussions here. I've read over the ticket and also looked over the highlights of the PR.

I'm afraid I really don't think that EasyMock is going to be a good approach here. It's a great library for when you need to mock things that are painful to implement, or when you need to play through different specific scenarios in different tests. But at the same time, it can be a major pain to work with. Tracing the execution with the debugger is virtually impossible: as soon as you accidentally descend into the mock, you're deep into Java black magic. Figuring out when to set up expectations and when to replay them is deeply confusing. And paradoxically, just setting up a mock to consistently behave in a "normal" way takes an obnoxious amount of code.

My experience says that if what you're trying to mock has a relatively simple interface and state space, then just go ahead and write a "manually mocked" implementation. Only reach for EasyMock when this basic approach is too difficult for some reason.

If I understand this ticket, the motivation was _only_ that there exist two similar classes in the Apache codebase, not that there is anything actually wrong with either of them. In that case, we shouldn't switch to EasyMock. We should just consolidate the implementations... Or not! I didn't see any actual justification for _why_ it's a problem to have two similar classes, and if consolidating them results in a lot of difficulty, or a lot of extra code, then consolidating is the wrong thing to do. Either way, just looking at the PR, it seems that this _consolidation_ strategy results in adding _double_ the lines of code that we're removing.

I know that this must have taken a ton of work, so I've been sitting here trying to think of something else to say, but it really feels like we need to go back to the drawing board here and ask ourselves what we're trying to accomplish, and at what cost.

Thanks,
John

> Consolidate InternalMockProcessorContext and MockInternalProcessorContext
> --------------------------------------------------------------------------
>
>                 Key: KAFKA-9088
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9088
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams, unit tests
>            Reporter: Bruno Cadonna
>            Assignee: Pierangelo Di Pilato
>            Priority: Minor
>              Labels: newbie
>
> Currently, we have two mocks for the {{InternalProcessorContext}}. The goal
> of this ticket is to merge both into one mock or replace it with an
> {{EasyMock}} mock.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
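For a concrete illustration of the "manually mocked" style John describes, here is a minimal sketch (the interface below is invented for the example, not a real Streams type): a hand-written fake that records its calls in plain fields is trivially debuggable and needs no record/replay ceremony.

```java
import java.util.ArrayList;
import java.util.List;

// A hand-rolled mock: implement the small interface directly and capture
// interactions in an ordinary list, so a debugger can step straight
// through it and assertions read off plain state.
public class ManualMockExample {
    public interface RecordForwarder {
        void forward(String key, String value);
    }

    public static class CapturingForwarder implements RecordForwarder {
        public final List<String> forwarded = new ArrayList<>();

        @Override
        public void forward(String key, String value) {
            forwarded.add(key + "=" + value);
        }
    }
}
```

A test then calls the code under test with a `CapturingForwarder` and asserts directly on `forwarded`, with no expectation/replay phases to sequence.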
[jira] [Updated] (KAFKA-9258) Connect ConnectorStatusMetricsGroup sometimes NPE
[ https://issues.apache.org/jira/browse/KAFKA-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Randall Hauch updated KAFKA-9258:
---------------------------------
    Fix Version/s: 2.5.0

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Updated] (KAFKA-9258) Connect ConnectorStatusMetricsGroup sometimes NPE
[ https://issues.apache.org/jira/browse/KAFKA-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Randall Hauch updated KAFKA-9258:
---------------------------------
    Affects Version/s: 2.5.0

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (KAFKA-9258) Connect ConnectorStatusMetricsGroup sometimes NPE
[ https://issues.apache.org/jira/browse/KAFKA-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987477#comment-16987477 ]

ASF GitHub Bot commented on KAFKA-9258:
---------------------------------------

rhauch commented on pull request #7768: KAFKA-9258 Check Connect Metrics non-null in task stop
URL: https://github.com/apache/kafka/pull/7768

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
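Per the PR title ("Check Connect Metrics non-null in task stop"), the fix amounts to a null guard on the metrics group in the task-stop path. A minimal sketch with invented names, not the actual `Worker` internals:

```java
// Hypothetical sketch of the guard: if the connector's status metrics
// group was never created (or was already closed), skip the metric
// update instead of letting recordTaskRemoved throw the NPE seen in
// awaitStopTask above.
public class TaskStopGuardSketch {
    public interface ConnectorStatusMetrics {
        void recordTaskRemoved(String taskId);
    }

    public static boolean safeRecordTaskRemoved(ConnectorStatusMetrics metrics, String taskId) {
        if (metrics == null) {
            return false; // nothing to record; avoids the NullPointerException
        }
        metrics.recordTaskRemoved(taskId);
        return true;
    }
}
```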
[jira] [Commented] (KAFKA-9073) Kafka Streams State stuck in rebalancing after one of the StreamThread encounters java.lang.IllegalStateException: No current assignment for partition
[ https://issues.apache.org/jira/browse/KAFKA-9073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987450#comment-16987450 ]

ASF GitHub Bot commented on KAFKA-9073:
---------------------------------------

guozhangwang commented on pull request #7630: KAFKA-9073: check assignment in requestFailed to avoid NPE
URL: https://github.com/apache/kafka/pull/7630

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Kafka Streams State stuck in rebalancing after one of the StreamThread
> encounters java.lang.IllegalStateException: No current assignment for
> partition
> --------------------------------------------------------------------------
>
>                 Key: KAFKA-9073
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9073
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 2.3.0
>            Reporter: amuthan Ganeshan
>            Priority: Major
>             Fix For: 2.4.0
>
>         Attachments: KAFKA-9073.log
>
> I have a Kafka Streams application that stores the incoming messages into a
> state store and later, during the punctuation period, stores them into a
> big-data persistent store after processing the messages.
> The application consumes from 120 partitions distributed across 40 instances.
> It has been running fine without any problem for months, but all of a sudden
> some of the instances failed because of a stream-thread exception saying
> ```java.lang.IllegalStateException: No current assignment for partition
> --changelog-98```
>
> Other instances are stuck in the REBALANCING state and never come out of it.
> Here is the full stack trace; I just masked the application-specific app name
> and store name in the stack trace due to NDA.
>
> ```
> 2019-10-21 13:27:13,481 ERROR [application.id-a2c06c51-bfc7-449a-a094-d8b770caee92-StreamThread-3] [org.apache.kafka.streams.processor.internals.StreamThread] [] stream-thread [application.id-a2c06c51-bfc7-449a-a094-d8b770caee92-StreamThread-3] Encountered the following error during processing:
> java.lang.IllegalStateException: No current assignment for partition application.id-store_name-changelog-98
>     at org.apache.kafka.clients.consumer.internals.SubscriptionState.assignedState(SubscriptionState.java:319)
>     at org.apache.kafka.clients.consumer.internals.SubscriptionState.requestFailed(SubscriptionState.java:618)
>     at org.apache.kafka.clients.consumer.internals.Fetcher$2.onFailure(Fetcher.java:709)
>     at org.apache.kafka.clients.consumer.internals.RequestFuture.fireFailure(RequestFuture.java:177)
>     at org.apache.kafka.clients.consumer.internals.RequestFuture.raise(RequestFuture.java:147)
>     at org.apache.kafka.clients.consumer.internals.RequestFutureAdapter.onFailure(RequestFutureAdapter.java:30)
>     at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onFailure(RequestFuture.java:209)
>     at org.apache.kafka.clients.consumer.internals.RequestFuture.fireFailure(RequestFuture.java:177)
>     at org.apache.kafka.clients.consumer.internals.RequestFuture.raise(RequestFuture.java:147)
>     at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:574)
>     at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:388)
>     at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:294)
>     at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:233)
>     at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1281)
>     at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1225)
>     at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1201)
>     at org.apache.kafka.streams.processor.internals.StreamThread.maybeUpdateStandbyTasks(StreamThread.java:1126)
>     at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:923)
>     at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:805)
>     at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:774)
> ```
>
> I checked the state store disk usage; it is less than 40% of the total disk
> space available. Restarting the application solves the problem for a short
> amount of time, but the error quickly pops up again randomly on other
> instances.
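The linked PR's title ("check assignment in requestFailed to avoid NPE") suggests the shape of the fix; a hedged sketch with heavily simplified types (the real `SubscriptionState` tracks much more state per partition):

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch: when a fetch fails after a rebalance has already
// revoked the partition, the failure callback should skip partitions
// that are no longer assigned instead of throwing the
// IllegalStateException seen in the stack trace above.
public class RequestFailedSketch {
    private final Set<String> assignment = new HashSet<>();

    public void assign(String topicPartition) {
        assignment.add(topicPartition);
    }

    /** Returns true if failure state was recorded, false if the partition was already revoked. */
    public boolean onRequestFailed(String topicPartition) {
        if (!assignment.contains(topicPartition)) {
            return false; // revoked concurrently; nothing to update
        }
        // ... record retry/backoff state for the still-assigned partition ...
        return true;
    }
}
```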
[jira] [Commented] (KAFKA-9173) StreamsPartitionAssignor assigns partitions to only one worker
[ https://issues.apache.org/jira/browse/KAFKA-9173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987420#comment-16987420 ]

Guozhang Wang commented on KAFKA-9173:
--------------------------------------

[~o.muravskiy] Note that the number of tasks is not equal to the number of partitions, since many partitions of different topics can map to one task. Although I did not see your source code, from the logs I think the 21 topics are mapped together into one sub-topology (if the concept of a sub-topology is unfamiliar, you can read about it in the web docs here: https://docs.confluent.io/current/streams/architecture.html), and hence you still have only 10 tasks instead of 210.

> StreamsPartitionAssignor assigns partitions to only one worker
> ---------------------------------------------------------------
>
>                 Key: KAFKA-9173
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9173
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 2.3.0, 2.2.1
>            Reporter: Oleg Muravskiy
>            Priority: Major
>              Labels: user-experience
>         Attachments: StreamsPartitionAssignor.log
>
> I'm running a distributed KafkaStreams application on 10 worker nodes,
> subscribed to 21 topics with 10 partitions each. I'm only using the
> Processor interface and a persistent state store.
> However, only one worker gets assigned partitions; all other workers get
> nothing. Restarting the application or cleaning local state stores does not
> help. StreamsPartitionAssignor migrates to other nodes, and eventually picks
> another node to assign partitions to, but still only one node.
> It's difficult to figure out where to look for signs of problems. I'm
> attaching the log messages from the StreamsPartitionAssignor. Let me know
> what else I can provide to help resolve this.
> [^StreamsPartitionAssignor.log]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
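Guozhang's arithmetic can be checked directly: the task count for a sub-topology is the maximum partition count among its input topics, so 21 topics of 10 partitions each, fused into one sub-topology, yield 10 tasks rather than 210.

```java
// Worked example of the task-count rule described above:
// tasks per sub-topology = max partition count across its input topics.
public class TaskCountExample {
    public static int numTasks(int[] partitionsPerInputTopic) {
        int max = 0;
        for (int partitions : partitionsPerInputTopic) {
            max = Math.max(max, partitions);
        }
        return max;
    }
}
```

With only 10 tasks to distribute, at most 10 of the workers can receive work, which is why the reporter's instance count matters less than the per-sub-topology partition count.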
[jira] [Commented] (KAFKA-9173) StreamsPartitionAssignor assigns partitions to only one worker
[ https://issues.apache.org/jira/browse/KAFKA-9173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987410#comment-16987410 ]

Sophie Blee-Goldman commented on KAFKA-9173:
--------------------------------------------

What exactly do you mean by "subscribed to 21 topics"? For example, are you using pattern subscription and have 21 input topics you expect to match, or are you using 21 different topics in the same topology?

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Comment Edited] (KAFKA-9231) Streams Threads may die from recoverable errors with EOS enabled
[ https://issues.apache.org/jira/browse/KAFKA-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987395#comment-16987395 ]

John Roesler edited comment on KAFKA-9231 at 12/4/19 1:14 AM:
--------------------------------------------------------------

Fixed in 2.4 by https://github.com/apache/kafka/commit/d7fe494b2a983256092bcc50eac8eab8eb8a6163
And in trunk by https://github.com/apache/kafka/commit/18c13d38ed7090801f125088e3eaec7e3c85c09d

was (Author: vvcephei):
Fixed in 2.4 by https://github.com/apache/kafka/commit/d7fe494b2a983256092bcc50eac8eab8eb8a6163

> Streams Threads may die from recoverable errors with EOS enabled
> -----------------------------------------------------------------
>
>                 Key: KAFKA-9231
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9231
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 2.4.0
>            Reporter: John Roesler
>            Assignee: John Roesler
>            Priority: Blocker
>             Fix For: 2.4.0
>
> While testing Streams in EOS mode under frequent and heavy network
> partitions, I've encountered the following error, leading to thread death:
> {noformat}
> [2019-11-26 04:54:02,650] ERROR [stream-soak-test-4290196e-d805-4acd-9f78-b459cc7e99ee-StreamThread-2] stream-thread [stream-soak-test-4290196e-d805-4acd-9f78-b459cc7e99ee-StreamThread-2] Encountered the following unexpected Kafka exception during processing, this usually indicate Streams internal errors: (org.apache.kafka.streams.processor.internals.StreamThread)
> org.apache.kafka.streams.errors.StreamsException: stream-thread [stream-soak-test-4290196e-d805-4acd-9f78-b459cc7e99ee-StreamThread-2] Failed to rebalance.
>     at org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:852)
>     at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:739)
>     at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:698)
>     at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:671)
> Caused by: org.apache.kafka.streams.errors.StreamsException: stream-thread [stream-soak-test-4290196e-d805-4acd-9f78-b459cc7e99ee-StreamThread-2] failed to suspend stream tasks
>     at org.apache.kafka.streams.processor.internals.TaskManager.suspendActiveTasksAndState(TaskManager.java:253)
>     at org.apache.kafka.streams.processor.internals.StreamsRebalanceListener.onPartitionsRevoked(StreamsRebalanceListener.java:116)
>     at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.invokePartitionsRevoked(ConsumerCoordinator.java:291)
>     at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onLeavePrepare(ConsumerCoordinator.java:707)
>     at org.apache.kafka.clients.consumer.KafkaConsumer.unsubscribe(KafkaConsumer.java:1073)
>     at org.apache.kafka.streams.processor.internals.StreamThread.enforceRebalance(StreamThread.java:716)
>     at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:710)
>     ... 1 more
> Caused by: org.apache.kafka.streams.errors.ProcessorStateException: task [1_1] Failed to flush state store KSTREAM-AGGREGATE-STATE-STORE-07
>     at org.apache.kafka.streams.processor.internals.ProcessorStateManager.flush(ProcessorStateManager.java:279)
>     at org.apache.kafka.streams.processor.internals.AbstractTask.flushState(AbstractTask.java:175)
>     at org.apache.kafka.streams.processor.internals.StreamTask.flushState(StreamTask.java:581)
>     at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:535)
>     at org.apache.kafka.streams.processor.internals.StreamTask.suspend(StreamTask.java:660)
>     at org.apache.kafka.streams.processor.internals.StreamTask.suspend(StreamTask.java:628)
>     at org.apache.kafka.streams.processor.internals.AssignedStreamsTasks.suspendRunningTasks(AssignedStreamsTasks.java:145)
>     at org.apache.kafka.streams.processor.internals.AssignedStreamsTasks.suspendOrCloseTasks(AssignedStreamsTasks.java:128)
>     at org.apache.kafka.streams.processor.internals.TaskManager.suspendActiveTasksAndState(TaskManager.java:246)
>     ... 7 more
> Caused by: org.apache.kafka.common.errors.ProducerFencedException: Producer attempted an operation with an old epoch. Either there is a newer producer with the same transactionalId, or the producer's transaction has been expired by the broker.
> [2019-11-26 04:54:02,650] INFO [stream-soak-test-4290196e-d805-4acd-9f78-b459cc7e99ee-StreamThread-2] stream-thread [stream-soak-test-4290196e-d805-4acd-9f78-b459cc7e99ee-StreamThread-2] State transition from PARTITIONS_REVOKED to PENDING_SHUTDOWN >
[jira] [Resolved] (KAFKA-9231) Streams Threads may die from recoverable errors with EOS enabled
[ https://issues.apache.org/jira/browse/KAFKA-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Roesler resolved KAFKA-9231. - Resolution: Fixed Fixed in 2.4 by https://github.com/apache/kafka/commit/d7fe494b2a983256092bcc50eac8eab8eb8a6163 > Streams Threads may die from recoverable errors with EOS enabled > > > Key: KAFKA-9231 > URL: https://issues.apache.org/jira/browse/KAFKA-9231 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.4.0 >Reporter: John Roesler >Assignee: John Roesler >Priority: Blocker > Fix For: 2.4.0
[jira] [Commented] (KAFKA-9231) Streams Threads may die from recoverable errors with EOS enabled
[ https://issues.apache.org/jira/browse/KAFKA-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987393#comment-16987393 ] ASF GitHub Bot commented on KAFKA-9231: --- vvcephei commented on pull request #7748: KAFKA-9231: Streams Threads may die from recoverable errors with EOS enabled URL: https://github.com/apache/kafka/pull/7748 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Streams Threads may die from recoverable errors with EOS enabled > > > Key: KAFKA-9231 > URL: https://issues.apache.org/jira/browse/KAFKA-9231 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.4.0 >Reporter: John Roesler >Assignee: John Roesler >Priority: Blocker > Fix For: 2.4.0
[jira] [Commented] (KAFKA-8948) Flaky Test StreamsEosTest#test_rebalance_complex is broken for multiple runs
[ https://issues.apache.org/jira/browse/KAFKA-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987334#comment-16987334 ] Sophie Blee-Goldman commented on KAFKA-8948: Turns out this was most likely a real issue that just happened to disappear again for some reason – a bug in this system test was discovered to be causing failures after subsequent runs on the same machine since it did not properly clean up after the test. That bug (and most likely also this ticket) should be fixed by [https://github.com/apache/kafka/pull/7693] > Flaky Test StreamsEosTest#test_rebalance_complex is broken for multiple runs > > > Key: KAFKA-8948 > URL: https://issues.apache.org/jira/browse/KAFKA-8948 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: Sophie Blee-Goldman >Priority: Blocker > Labels: system-tests > Attachments: KafkaService-worker5-server-start-stdout-stderr.log.tgz, > KafkaService-worker6-debug.tgz, > KafkaService-worker6-server-start-stdout-stderr.log.tgz, > KafkaService-worker7-debug.tgz, > KafkaService-worker7-server-start-stdout-stderr.log.tgz, > StreamsComplexEosTestDriverService-streams.log.tgz, > StreamsComplexEosTestDriverService-streams.stderr.tgz, > StreamsComplexEosTestDriverService-streams.stdout.tgz, > StreamsComplexEosTestJobRunningService-worker10-streams.log.tgz, > StreamsComplexEosTestJobRunningService-worker10-streams.stderr.tgz, > StreamsComplexEosTestJobRunningService-worker10-streams.stdout.tgz, > StreamsComplexEosTestJobRunningService-worker11-streams.log.tgz, > StreamsComplexEosTestJobRunningService-worker11-streams.stderr.tgz, > StreamsComplexEosTestJobRunningService-worker11-streams.stdout.tgz, > StreamsComplexEosTestJobRunningService-worker9-streams.log.tgz, > StreamsComplexEosTestJobRunningService-worker9-streams.stderr.tgz, > StreamsComplexEosTestJobRunningService-worker9-streams.stdout.tgz, > StreamsComplexEosTestVerifyRunnerService-streams.log.tgz, > 
StreamsComplexEosTestVerifyRunnerService-streams.stderr.tgz, > StreamsComplexEosTestVerifyRunnerService-streams.stdout.tgz, report.txt, > test_log.debug, test_log.info > > > When kicking off multiple runs of the system tests with `--repeat N`, > StreamsEosTest#test_rebalance_complex passes the first time but fails for all > subsequent runs. Logs from the failing run are attached
[jira] [Commented] (KAFKA-9265) kafka.log.Log instances are leaking on log delete
[ https://issues.apache.org/jira/browse/KAFKA-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987300#comment-16987300 ] ASF GitHub Bot commented on KAFKA-9265: --- soondenana commented on pull request #7773: KAFKA-9265: Fix kafka.log.Log instance leak on log deletion URL: https://github.com/apache/kafka/pull/7773 KAFKA-8448 fixes problem with similar leak. The Log objects are being held in ScheduledExecutor PeriodicProducerExpirationCheck callback. The fix in KAFKA-8448 was to change the policy of ScheduledExecutor to remove the scheduled task when it gets canceled (by calling setRemoveOnCancelPolicy(true)). This works when a log is closed using close() method. But when a log is deleted either when the topic gets deleted or when the rebalancing operation moves the replica away from broker, the delete() operation is invoked. Log.delete() doesn't close the pending scheduled task and that leaks Log instance. Fix is to close the scheduled task in the Log.delete() method too. Tested with and without this fix and Log instances are no longer leaked and scheduled tasks are gone once log is deleted. ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) > kafka.log.Log instances are leaking on log delete > - > > Key: KAFKA-9265 > URL: https://issues.apache.org/jira/browse/KAFKA-9265 > Project: Kafka > Issue Type: Bug >Reporter: Vikas Singh >Assignee: Vikas Singh >Priority: Major
[jira] [Commented] (KAFKA-9261) NPE when updating client metadata
[ https://issues.apache.org/jira/browse/KAFKA-9261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987293#comment-16987293 ] ASF GitHub Bot commented on KAFKA-9261: --- hachikuji commented on pull request #7772: KAFKA-9261; Client should handle inconsistent leader metadata URL: https://github.com/apache/kafka/pull/7772 This is a reduced scope fix. The purpose of this patch is to ensure that partition leader state is kept in sync with broker metadata in `MetadataCache` and consequently in `Cluster`. Due to the possibility of metadata event reordering, it was possible for this state to be inconsistent which could lead to an NPE in some cases. The test case here provides a specific scenario where this could happen. Also see #7770 for additional detail. ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org > NPE when updating client metadata > - > > Key: KAFKA-9261 > URL: https://issues.apache.org/jira/browse/KAFKA-9261 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Major > > We have seen the following exception recently: > {code} > java.lang.NullPointerException > at java.base/java.util.Objects.requireNonNull(Objects.java:221) > at org.apache.kafka.common.Cluster.(Cluster.java:134) > at org.apache.kafka.common.Cluster.(Cluster.java:89) > at > org.apache.kafka.clients.MetadataCache.computeClusterView(MetadataCache.java:120) > at org.apache.kafka.clients.MetadataCache.(MetadataCache.java:82) > at org.apache.kafka.clients.MetadataCache.(MetadataCache.java:58) > at > org.apache.kafka.clients.Metadata.handleMetadataResponse(Metadata.java:325) > at org.apache.kafka.clients.Metadata.update(Metadata.java:252) > at > org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater.handleCompletedMetadataResponse(NetworkClient.java:1059) > at > org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:845) > at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:548) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:262) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:233) > at > org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1281) > at > org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1225) > at > org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1201) > {code} > The client assumes that if a leader is included in the response, then node > information must also be available. There are at least a couple possible > reasons this assumption can fail: > 1. 
The client is able to detect stale partition metadata using leader epoch > information available. If stale partition metadata is detected, the client > ignores it and uses the last known metadata. However, it cannot detect stale > broker information and will always accept the latest update. This means that > the latest metadata may be a mix of multiple metadata responses and therefore > the invariant will not generally hold. > 2. There is no lock which protects both the fetching of partition metadata > and the live broker when handling a Metadata request. This means an > UpdateMetadata request can arrive concurrently and break the intended > invariant. > It seems case 2 has been possible for a long time, but it should be extremely > rare. Case 1 was only made possible with KIP-320, which added the leader > epoch tracking. It should also be rare, but the window for inconsistent > metadata is probably a bit bigger than the window for a concurrent update. > To fix this, we should make the client more defensive about metadata updates > and not assume that the leader is among the live endpoints. -- This message was sent by Atlassian Jira (v8.3.4#803005)
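The defensive handling proposed in the last sentence can be sketched in isolation. This is an illustrative stand-in, not the actual `MetadataCache` code; `resolveLeader` and the broker map are assumptions made for the example:

```java
import java.util.Map;
import java.util.Optional;

// Illustrative stand-in for defensive metadata handling: look up a partition
// leader among the live brokers and tolerate a stale leader id instead of
// assuming the leader is always among the live endpoints (the assumption
// that led to the NPE above).
public class LeaderResolution {

    static Optional<String> resolveLeader(int leaderId, Map<Integer, String> liveBrokers) {
        // Optional.empty() models "leader currently unknown"; the failing
        // code path effectively dereferenced a null node here instead.
        return Optional.ofNullable(liveBrokers.get(leaderId));
    }

    public static void main(String[] args) {
        Map<Integer, String> live = Map.of(1, "broker-1:9092", 2, "broker-2:9092");
        System.out.println(resolveLeader(1, live).isPresent()); // leader is live
        System.out.println(resolveLeader(3, live).isEmpty());   // stale leader id: no NPE
    }
}
```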
[jira] [Commented] (KAFKA-7016) Reconsider the "avoid the expensive and useless stack trace for api exceptions" practice
[ https://issues.apache.org/jira/browse/KAFKA-7016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987289#comment-16987289 ] ASF GitHub Bot commented on KAFKA-7016: --- ableegoldman commented on pull request #7756: KAFKA-7016: Don't fill in stack traces URL: https://github.com/apache/kafka/pull/7756 > Reconsider the "avoid the expensive and useless stack trace for api > exceptions" practice > > > Key: KAFKA-7016 > URL: https://issues.apache.org/jira/browse/KAFKA-7016 > Project: Kafka > Issue Type: Bug >Reporter: Martin Vysny >Priority: Major > > I am trying to write a Kafka Consumer; upon running, it only prints out: > {{ org.apache.kafka.common.errors.InvalidGroupIdException: The configured > groupId is invalid}} > Note that the stack trace is missing, so I have no information about which > part of my code is bad and needs fixing; I also have no information about which > Kafka Client method was called. Upon closer examination I found this in > ApiException: > > {{/* avoid the expensive and useless stack trace for api exceptions */}} > {{@Override}} > {{public Throwable fillInStackTrace() {}} > {{ return this;}} > {{}}} > > I think it is a bad practice to hide all useful debugging info and trade it > for dubious performance gains. Exceptions are for exceptional code flow, which > is allowed to be slow. > > This applies to kafka-clients 1.1.0
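The `ApiException` snippet quoted in the report can be reproduced in isolation to show the trade-off under discussion. The class below is a stand-in written for this example, not Kafka's actual class:

```java
// Stand-in for Kafka's ApiException pattern: overriding fillInStackTrace()
// to return `this` skips the expensive stack walk entirely, which is exactly
// why no stack trace appears in the reporter's logs.
public class NoStackTraceDemo {

    static class ApiStyleException extends RuntimeException {
        ApiStyleException(String message) { super(message); }

        /* avoid the expensive and useless stack trace for api exceptions */
        @Override
        public synchronized Throwable fillInStackTrace() {
            return this;
        }
    }

    static int traceDepth(Throwable t) {
        return t.getStackTrace().length;
    }

    public static void main(String[] args) {
        // A normal exception captures the call stack at construction time;
        // the ApiException-style one captures nothing.
        System.out.println(traceDepth(new RuntimeException("normal")) > 0); // true
        System.out.println(traceDepth(new ApiStyleException("api")));       // 0
    }
}
```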
[jira] [Created] (KAFKA-9265) kafka.log.Log instances are leaking on log delete
Vikas Singh created KAFKA-9265: -- Summary: kafka.log.Log instances are leaking on log delete Key: KAFKA-9265 URL: https://issues.apache.org/jira/browse/KAFKA-9265 Project: Kafka Issue Type: Bug Reporter: Vikas Singh KAFKA-8448 fixed a problem with a similar leak. The {{Log}} objects are held by the {{PeriodicProducerExpirationCheck}} callback scheduled on the {{ScheduledExecutor}}. The fix in KAFKA-8448 was to change the policy of the {{ScheduledExecutor}} to remove a scheduled task when it gets canceled (by calling {{setRemoveOnCancelPolicy(true)}}). This works when a log is closed using the {{close()}} method. But when a log is deleted, either because the topic gets deleted or because a rebalancing operation moves the replica away from the broker, the {{delete()}} operation is invoked. {{Log.delete()}} doesn't cancel the pending scheduled task, and that leaks the {{Log}} instance. The fix is to cancel the scheduled task in the {{Log.delete()}} method too.
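The cancel/remove interaction behind this leak can be demonstrated with a plain `ScheduledThreadPoolExecutor`. This sketch only illustrates the mechanism; it is not the `Log` code itself, and the scheduled no-op stands in for `PeriodicProducerExpirationCheck`:

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// A periodic task keeps its callback (and whatever the callback captures,
// e.g. a Log instance) reachable until it is cancelled AND removed from the
// executor's work queue.
public class ScheduledTaskLeakDemo {

    static int queueSizeAfterCancel() {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
        // The KAFKA-8448 policy: cancelled tasks are dropped from the queue
        // instead of lingering until their next (possibly distant) fire time.
        executor.setRemoveOnCancelPolicy(true);

        ScheduledFuture<?> check = executor.scheduleAtFixedRate(
                () -> { /* stand-in for PeriodicProducerExpirationCheck */ },
                1, 1, TimeUnit.HOURS);

        // Without this cancel (the step missing from Log.delete()), the queue
        // keeps the task, and everything it references, alive.
        check.cancel(false);
        int remaining = executor.getQueue().size();
        executor.shutdownNow();
        return remaining;
    }

    public static void main(String[] args) {
        System.out.println(queueSizeAfterCancel()); // 0: task removed once cancelled
    }
}
```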
[jira] [Assigned] (KAFKA-9265) kafka.log.Log instances are leaking on log delete
[ https://issues.apache.org/jira/browse/KAFKA-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Singh reassigned KAFKA-9265: -- Assignee: Vikas Singh > kafka.log.Log instances are leaking on log delete > - > > Key: KAFKA-9265 > URL: https://issues.apache.org/jira/browse/KAFKA-9265 > Project: Kafka > Issue Type: Bug >Reporter: Vikas Singh >Assignee: Vikas Singh >Priority: Major
[jira] [Commented] (KAFKA-6718) Rack Aware Stand-by Task Assignment for Kafka Streams
[ https://issues.apache.org/jira/browse/KAFKA-6718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987281#comment-16987281 ] John Roesler commented on KAFKA-6718: - Hey [~lkokhreidze]. I know Matthias pinged me a while back... I've just been re-queuing the task to look at this ticket every day to the next day. Sorry about that. I actually don't think that there's anything specific to worry about with respect to KIP-441. In my opinion, you can just design your feature against the current state of Streams, and whoever loses the race would just have to deal with adjusting the implementation to take the other feature into account. For what it's worth, though, I don't think there's too much semantic overlap, just the practical overlap that both efforts affect the same protocol and module. That's why I don't think we need to wait for each other _or_ try to unify the efforts. I don't know if you have any specific ideas, and I haven't had a chance to look into the details of the old work on this ticket. (This is what I haven't been able to find time to do). But sharing my immediate thoughts, I've used the "rack awareness" feature in Elasticsearch, and felt that it was a pretty straightforward and reasonable way to implement it. See https://www.elastic.co/guide/en/elasticsearch/reference/current/allocation-awareness.html for more information. Basically, they let you add some arbitrary "tags"/"attrs" to each instance (in the config), and then there's a "rack awareness" config that takes a list of "tags" to consider. It's a pretty simple, but also powerful, arrangement. You might also want to consider the "forced awareness" section, which is in response to a concern that also applies to us. Thanks for picking it up! 
> Rack Aware Stand-by Task Assignment for Kafka Streams > - > > Key: KAFKA-6718 > URL: https://issues.apache.org/jira/browse/KAFKA-6718 > Project: Kafka > Issue Type: New Feature > Components: streams >Reporter: Deepak Goyal >Priority: Major > Labels: needs-kip > > Machines in a data centre are sometimes grouped in racks. Racks provide > isolation, as each rack may be in a different physical location and has its > own power source. When tasks are properly replicated across racks, this > provides fault tolerance: if a rack goes down, the remaining racks can > continue to serve traffic. > > This feature is already implemented in Kafka > ([KIP-36|https://cwiki.apache.org/confluence/display/KAFKA/KIP-36+Rack+aware+replica+assignment]), > but we needed something similar for task assignment at the Kafka Streams application > layer. > > This feature enables replica tasks to be assigned to different racks for > fault tolerance. > NUM_STANDBY_REPLICAS = x > totalTasks = x+1 (replica + active) > # If no rackID is provided: the cluster will behave rack-unaware > # If the same rackId is given to all nodes: the cluster will behave rack-unaware > # If (totalTasks <= number of racks), the cluster will be rack-aware, i.e. > each replica task is assigned to a different rack. > # If (totalTasks > number of racks), it will first assign tasks on > different racks; further tasks will be assigned to the least loaded node, cluster-wide. > We have added another config in StreamsConfig called "RACK_ID_CONFIG" which > helps the StickyPartitionAssignor to assign tasks in such a way that no two > replica tasks are on the same rack if possible. > After that, it also helps to maintain stickiness within the rack.
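The assignment rules listed in the issue can be sketched roughly as follows. `Node`, `rackId`, and `assignCopies` are illustrative names invented for this example, not the StickyPartitionAssignor API: place each copy of a task on a distinct rack while unused racks remain, then fall back to the least-loaded node cluster-wide.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Rough sketch of rack-aware copy placement for one task.
public class RackAwareSketch {

    record Node(String id, String rackId) {}

    static List<Node> assignCopies(int copies, List<Node> nodes) {
        Map<String, Integer> load = new HashMap<>();   // copies placed per node
        Set<String> usedRacks = new HashSet<>();       // racks already holding a copy
        List<Node> placement = new ArrayList<>();
        for (int i = 0; i < copies; i++) {
            // Prefer a node on a rack not yet holding a copy of this task.
            Node pick = nodes.stream()
                    .filter(n -> !usedRacks.contains(n.rackId()))
                    .min(Comparator.comparingInt((Node n) -> load.getOrDefault(n.id(), 0)))
                    // All racks used: fall back to the least-loaded node.
                    .orElseGet(() -> nodes.stream()
                            .min(Comparator.comparingInt((Node n) -> load.getOrDefault(n.id(), 0)))
                            .orElseThrow());
            usedRacks.add(pick.rackId());
            load.merge(pick.id(), 1, Integer::sum);
            placement.add(pick);
        }
        return placement;
    }

    public static void main(String[] args) {
        List<Node> nodes = List.of(new Node("a", "r1"), new Node("b", "r1"),
                                   new Node("c", "r2"));
        // Active + one standby across two racks: copies land on distinct racks.
        List<Node> placed = assignCopies(2, nodes);
        System.out.println(placed.get(0).rackId().equals(placed.get(1).rackId())); // false
    }
}
```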
[jira] [Commented] (KAFKA-9184) Redundant task creation and periodic rebalances after zombie worker rejoins the group
[ https://issues.apache.org/jira/browse/KAFKA-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987279#comment-16987279 ] ASF GitHub Bot commented on KAFKA-9184: --- kkonstantine commented on pull request #7771: KAFKA-9184: Redundant task creation and periodic rebalances after zombie Connect worker rejoins the group URL: https://github.com/apache/kafka/pull/7771 Zombie workers, defined as workers that lose connectivity with the Kafka broker coordinator and get kicked out of the group but don't experience a JVM restart, have been keeping their tasks running. This side-effect is more disruptive with the new Incremental Cooperative rebalance protocol. When such workers return: a) they join the group with their existing assignment, and this leads to redundant tasks running in the Connect cluster, and b) they interfere with the computation of lost tasks, which before this fix would lead to the scheduled rebalance delay not being reset correctly back to 0. This results in periodic rebalances. This fix focuses on resolving the above side-effects as follows: * Each worker now tracks its connectivity with the broker coordinator in a non-blocking manner. This allows the worker to detect that the broker coordinator is unreachable. The timeout is set to be equal to the heartbeat interval. If during this time the connection remains inactive, the worker will proactively stop all its connectors and tasks and will keep attempting to connect to the coordinator. * The incremental cooperative assignor will keep the delay at a positive value as long as it can detect lost tasks. If the set of tasks that are computed as lost becomes empty, the delay will be set to zero and no additional rebalancing will be scheduled. Besides the test included in this PR, the improvements are being tested with a framework that deploys a Connect cluster on docker images and introduces network partitions between all or selected workers and the Kafka brokers. 
### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Redundant task creation and periodic rebalances after zombie worker rejoins > the group > - > > Key: KAFKA-9184 > URL: https://issues.apache.org/jira/browse/KAFKA-9184 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 2.4.0, 2.3.2 >Reporter: Konstantine Karantasis >Assignee: Konstantine Karantasis >Priority: Blocker > Fix For: 2.4.0, 2.3.2 > > > First reported here: > https://stackoverflow.com/questions/58631092/kafka-connect-assigns-same-task-to-multiple-workers > There seems to be an issue with task reassignment when a worker rejoins after > an unsuccessful join request. The worker seems to be outside the group for a > generation but when it joins again the same task is running in more than one > worker -- This message was sent by Atlassian Jira (v8.3.4#803005)
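The proactive connectivity check described in the PR can be sketched as a minimal liveness tracker. Class and method names here are illustrative, not the Connect runtime API; the worker records when it last reached the broker coordinator and, once that age exceeds the heartbeat interval, stops its own connectors and tasks instead of staying a zombie.

```java
// Minimal sketch of zombie-worker avoidance via coordinator liveness tracking.
public class CoordinatorLiveness {

    private final long heartbeatIntervalMs;
    private long lastContactMs;

    public CoordinatorLiveness(long heartbeatIntervalMs, long nowMs) {
        this.heartbeatIntervalMs = heartbeatIntervalMs;
        this.lastContactMs = nowMs;
    }

    // Called whenever the worker successfully reaches the coordinator.
    public void recordContact(long nowMs) {
        lastContactMs = nowMs;
    }

    // True once the coordinator has been unreachable longer than one
    // heartbeat interval: the worker should proactively stop its tasks
    // rather than keep them running outside the group.
    public boolean shouldStopTasks(long nowMs) {
        return nowMs - lastContactMs > heartbeatIntervalMs;
    }

    public static void main(String[] args) {
        CoordinatorLiveness liveness = new CoordinatorLiveness(3000, 0);
        System.out.println(liveness.shouldStopTasks(2000)); // false: within interval
        liveness.recordContact(4000);
        System.out.println(liveness.shouldStopTasks(5000)); // false: recent contact
        System.out.println(liveness.shouldStopTasks(8000)); // true: unreachable too long
    }
}
```

Passing `nowMs` explicitly (rather than reading the clock inside) keeps the sketch deterministic and easy to test.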
[jira] [Updated] (KAFKA-9184) Redundant task creation and periodic rebalances after zombie worker rejoins the group
[ https://issues.apache.org/jira/browse/KAFKA-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantine Karantasis updated KAFKA-9184: -- Summary: Redundant task creation and periodic rebalances after zombie worker rejoins the group (was: Redundant task creation after worker fails to join a specific group generation) > Redundant task creation and periodic rebalances after zombie worker rejoins > the group > - > > Key: KAFKA-9184 > URL: https://issues.apache.org/jira/browse/KAFKA-9184 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 2.3.2 >Reporter: Konstantine Karantasis >Assignee: Konstantine Karantasis >Priority: Blocker > Fix For: 2.3.2 > > > First reported here: > https://stackoverflow.com/questions/58631092/kafka-connect-assigns-same-task-to-multiple-workers > There seems to be an issue with task reassignment when a worker rejoins after > an unsuccessful join request. The worker seems to be outside the group for a > generation but when it joins again the same task is running in more than one > worker -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-9184) Redundant task creation and periodic rebalances after zombie worker rejoins the group
[ https://issues.apache.org/jira/browse/KAFKA-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantine Karantasis updated KAFKA-9184: -- Fix Version/s: 2.4.0 > Redundant task creation and periodic rebalances after zombie worker rejoins > the group > - > > Key: KAFKA-9184 > URL: https://issues.apache.org/jira/browse/KAFKA-9184 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 2.4.0, 2.3.2 >Reporter: Konstantine Karantasis >Assignee: Konstantine Karantasis >Priority: Blocker > Fix For: 2.4.0, 2.3.2 > > > First reported here: > https://stackoverflow.com/questions/58631092/kafka-connect-assigns-same-task-to-multiple-workers > There seems to be an issue with task reassignment when a worker rejoins after > an unsuccessful join request. The worker seems to be outside the group for a > generation but when it joins again the same task is running in more than one > worker -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-9184) Redundant task creation and periodic rebalances after zombie worker rejoins the group
[ https://issues.apache.org/jira/browse/KAFKA-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantine Karantasis updated KAFKA-9184: -- Affects Version/s: 2.4.0 > Redundant task creation and periodic rebalances after zombie worker rejoins > the group > - > > Key: KAFKA-9184 > URL: https://issues.apache.org/jira/browse/KAFKA-9184 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 2.4.0, 2.3.2 >Reporter: Konstantine Karantasis >Assignee: Konstantine Karantasis >Priority: Blocker > Fix For: 2.3.2 > > > First reported here: > https://stackoverflow.com/questions/58631092/kafka-connect-assigns-same-task-to-multiple-workers > There seems to be an issue with task reassignment when a worker rejoins after > an unsuccessful join request. The worker seems to be outside the group for a > generation but when it joins again the same task is running in more than one > worker -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-9261) NPE when updating client metadata
[ https://issues.apache.org/jira/browse/KAFKA-9261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987217#comment-16987217 ] ASF GitHub Bot commented on KAFKA-9261: --- hachikuji commented on pull request #7770: KAFKA-9261; Client should handle unavailable leader metadata URL: https://github.com/apache/kafka/pull/7770 The client caches metadata fetched from Metadata requests. Previously, each metadata response overwrote all of the metadata from the previous one, so we could rely on the expectation that the broker only returned the leaderId for a partition if it had connection information available. This behavior changed with KIP-320 since having the leader epoch allows the client to filter out partition metadata which is known to be stale. However, because of this, we can no longer rely on the request-level guarantee of leader availability. There is no mechanism similar to the leader epoch to track the staleness of broker metadata, so we still overwrite all of the broker metadata from each response, which means that the partition metadata can get out of sync with the broker metadata in the client's cache. Hence it is no longer safe to validate inside the `Cluster` constructor that each leader has an associated `Node` Fixing this issue was unfortunately not straightforward because the cache was built to maintain references to broker metadata through the `Node` object at the partition level. In order to keep the state consistent, each `Node` reference would need to be updated based on the new broker metadata. Instead of doing that, this patch changes the cache so that it is structured more closely with the Metadata response schema. Broker node information is maintained at the top level in a single collection and cached partition metadata only references the id of the broker. To accommodate this, we have removed `PartitionInfoAndEpoch` and we have altered `MetadataResponse.PartitionMetadata` to eliminate its `Node` references. 
Note that one of the side benefits of the refactor here is that we virtually eliminate one of the hotspots in Metadata request handling in `MetadataCache.getEndpoints` (which was renamed to `maybeFilterAliveReplicas`). The only reason this was expensive was that we had to build a new collection for the `Node` representations of each of the replica lists. This information was doomed to be discarded on serialization, so the whole effort was wasteful. Now, we work with the lower-level id lists and no copy of the replicas is needed (at least for all versions other than 0). ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > NPE when updating client metadata > - > > Key: KAFKA-9261 > URL: https://issues.apache.org/jira/browse/KAFKA-9261 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Major > > We have seen the following exception recently: > {code} > java.lang.NullPointerException > at java.base/java.util.Objects.requireNonNull(Objects.java:221) > at org.apache.kafka.common.Cluster.<init>(Cluster.java:134) > at org.apache.kafka.common.Cluster.<init>(Cluster.java:89) > at > org.apache.kafka.clients.MetadataCache.computeClusterView(MetadataCache.java:120) > at org.apache.kafka.clients.MetadataCache.<init>(MetadataCache.java:82) > at org.apache.kafka.clients.MetadataCache.<init>(MetadataCache.java:58) > at > org.apache.kafka.clients.Metadata.handleMetadataResponse(Metadata.java:325) > at org.apache.kafka.clients.Metadata.update(Metadata.java:252) > at > 
org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater.handleCompletedMetadataResponse(NetworkClient.java:1059) > at > org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:845) > at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:548) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:262) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:233) > at > org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1281) > at > org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1225) > at >
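The cache restructuring described in the KAFKA-9261 PR comment can be sketched as below. All names here are ours (the real code lives in `org.apache.kafka.clients.MetadataCache`): the point is only that partition metadata keeps the leader's broker id instead of a `Node` reference, and resolving the leader through one top-level broker map yields "no leader available" instead of a NullPointerException when partition and broker metadata drift out of sync.

```python
# Hedged sketch of a metadata cache structured like the Metadata response:
# brokers at the top level, partitions referencing brokers only by id.
class MetadataCacheSketch:
    def __init__(self):
        self.brokers = {}      # broker id -> node (here just a host string)
        self.partitions = {}   # (topic, partition) -> leader broker id

    def update(self, brokers, partitions):
        self.brokers = dict(brokers)        # broker metadata is fully overwritten per response
        self.partitions.update(partitions)  # partition metadata is merged (epoch filtering elided)

    def leader_for(self, topic, partition):
        leader_id = self.partitions.get((topic, partition))
        # May return None: a known leader id with no connection info is now
        # representable instead of being a constructor-time error.
        return self.brokers.get(leader_id)

cache = MetadataCacheSketch()
cache.update({1: "broker-1:9092"}, {("t", 0): 1})
print(cache.leader_for("t", 0))         # -> broker-1:9092
cache.update({2: "broker-2:9092"}, {})  # broker 1 dropped; partition entry retained
print(cache.leader_for("t", 0))         # -> None, not an NPE
```

The second `update` call reproduces the failure scenario from the stack trace: retained partition metadata pointing at a broker that the latest response no longer lists.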
[jira] [Commented] (KAFKA-9154) ProducerId generation should be managed by the Controller
[ https://issues.apache.org/jira/browse/KAFKA-9154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987156#comment-16987156 ] Satish Duggana commented on KAFKA-9154: --- [~cmccabe] Please let me know if I can take this up. > ProducerId generation should be managed by the Controller > - > > Key: KAFKA-9154 > URL: https://issues.apache.org/jira/browse/KAFKA-9154 > Project: Kafka > Issue Type: Sub-task > Components: core >Reporter: Viktor Somogyi-Vass >Assignee: Colin McCabe >Priority: Major > > Currently producerIds are maintained in Zookeeper but in the future we'd like > them to be managed by the controller quorum in an internal topic. > The reason for storing this in Zookeeper was that this must be unique across > the cluster. In this task it should be refactored such that the > TransactionManager turns to the Controller for a ProducerId, and the > Controller connects to Zookeeper to acquire this ID. Since ZK is the single source of truth and the > PID won't be cached anywhere it should be safe (just one extra hop added). -- This message was sent by Atlassian Jira (v8.3.4#803005)
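The indirection proposed in KAFKA-9154 can be sketched as follows. Every name here is hypothetical (`FakeZookeeper`, `Controller`, `allocate_producer_id`): it only illustrates the hop structure the issue describes, where brokers stop talking to Zookeeper directly and the controller alone reads the single source of truth, with no caching in between.

```python
class FakeZookeeper:
    """Stand-in for ZK: the cluster-wide, single source of truth for the next id."""
    def __init__(self):
        self._next = 0

    def get_and_increment(self):
        pid, self._next = self._next, self._next + 1
        return pid

class Controller:
    """Stand-in for the controller: the only component allowed to touch ZK."""
    def __init__(self, zk):
        self.zk = zk

    def allocate_producer_id(self):
        # One extra hop per allocation, but ids stay unique because nothing
        # is cached outside the single counter.
        return self.zk.get_and_increment()

zk = FakeZookeeper()
controller = Controller(zk)
# TransactionManagers on different brokers go through the controller and
# still receive unique ids:
print(controller.allocate_producer_id())  # -> 0
print(controller.allocate_producer_id())  # -> 1
```

This matches the issue's safety argument: because the PID is never cached anywhere, uniqueness reduces to the atomicity of the single counter, at the cost of one extra round trip per allocation.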
[jira] [Commented] (KAFKA-8937) Flaky Test SaslPlainSslEndToEndAuthorizationTest#testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl
[ https://issues.apache.org/jira/browse/KAFKA-8937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987067#comment-16987067 ] Rajini Sivaram commented on KAFKA-8937: --- This is a bug in the implementation and is being fixed under KAFKA-9190. > Flaky Test > SaslPlainSslEndToEndAuthorizationTest#testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl > -- > > Key: KAFKA-8937 > URL: https://issues.apache.org/jira/browse/KAFKA-8937 > Project: Kafka > Issue Type: Bug > Components: core >Reporter: Sophie Blee-Goldman >Priority: Major > Labels: flaky-test > > h3. Error Message > org.scalatest.exceptions.TestFailedException: Consumed 0 records before > timeout instead of the expected 1 records > h3. Stacktrace > org.scalatest.exceptions.TestFailedException: Consumed 0 records before > timeout instead of the expected 1 records at > org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:530) > at > org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1389) > at org.scalatest.Assertions$class.fail(Assertions.scala:1091) at > org.scalatest.Assertions$.fail(Assertions.scala:1389) at > kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:830) at > kafka.utils.TestUtils$.pollRecordsUntilTrue(TestUtils.scala:792) at > kafka.utils.TestUtils$.pollUntilAtLeastNumRecords(TestUtils.scala:1324) at > kafka.utils.TestUtils$.consumeRecords(TestUtils.scala:1333) at > kafka.api.EndToEndAuthorizationTest.consumeRecords(EndToEndAuthorizationTest.scala:529) > at > kafka.api.EndToEndAuthorizationTest.testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl(EndToEndAuthorizationTest.scala:368) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) at > 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305) at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:365) at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:330) at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:78) at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:328) at > org.junit.runners.ParentRunner.access$100(ParentRunner.java:65) at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:292) at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305) at > org.junit.runners.ParentRunner.run(ParentRunner.java:412) at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110) > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58) > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38) > at > 
org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62) > at > org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51) > at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source) at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) at > org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36) > at > org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24) > at > org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:33) > at >
[jira] [Commented] (KAFKA-9203) kafka-client 2.3.1 fails to consume lz4 compressed topic
[ https://issues.apache.org/jira/browse/KAFKA-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987057#comment-16987057 ] ASF GitHub Bot commented on KAFKA-9203: --- ijuma commented on pull request #7769: KAFKA-9203: Revert "MINOR: Remove workarounds for lz4-java bug affecting byte buffers (#6679)" URL: https://github.com/apache/kafka/pull/7769 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > kafka-client 2.3.1 fails to consume lz4 compressed topic > > > Key: KAFKA-9203 > URL: https://issues.apache.org/jira/browse/KAFKA-9203 > Project: Kafka > Issue Type: Bug > Components: compression, consumer >Affects Versions: 2.3.0, 2.3.1 >Reporter: David Watzke >Assignee: Ismael Juma >Priority: Blocker > Fix For: 2.4.0, 2.3.2 > > Attachments: kafka-clients-2.3.2-SNAPSHOT.jar > > > I run kafka cluster 2.1.1 > when I upgraded the consumer app to use kafka-client 2.3.0 (or 2.3.1) instead > of 2.2.0, I immediately started getting the following exceptions in a loop > when consuming a topic with LZ4-compressed messages: > {noformat} > 2019-11-18 11:59:16.888 ERROR [afka-consumer-thread] > com.avast.utils2.kafka.consumer.ConsumerThread: Unexpected exception occurred > while polling and processing messages: org.apache.kafka.common.KafkaExce > ption: Received exception when fetching the next record from > FILEREP_LONER_REQUEST-4. If needed, please seek past the record to continue > consumption. > org.apache.kafka.common.KafkaException: Received exception when fetching the > next record from FILEREP_LONER_REQUEST-4. If needed, please seek past the > record to continue consumption. 
> at > org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.fetchRecords(Fetcher.java:1473) > > at > org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.access$1600(Fetcher.java:1332) > > at > org.apache.kafka.clients.consumer.internals.Fetcher.fetchRecords(Fetcher.java:645) > > at > org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:606) > > at > org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1263) > > at > org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1225) > at > org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1201) > at > com.avast.utils2.kafka.consumer.ConsumerThread.pollAndProcessMessagesFromKafka(ConsumerThread.java:180) > > at > com.avast.utils2.kafka.consumer.ConsumerThread.run(ConsumerThread.java:149) > at > com.avast.filerep.saver.RequestSaver$.$anonfun$main$8(RequestSaver.scala:28) > at > com.avast.filerep.saver.RequestSaver$.$anonfun$main$8$adapted(RequestSaver.scala:19) > > at > resource.AbstractManagedResource.$anonfun$acquireFor$1(AbstractManagedResource.scala:88) > > at > scala.util.control.Exception$Catch.$anonfun$either$1(Exception.scala:252) > at scala.util.control.Exception$Catch.apply(Exception.scala:228) > at scala.util.control.Exception$Catch.either(Exception.scala:252) > at > resource.AbstractManagedResource.acquireFor(AbstractManagedResource.scala:88) > at > resource.ManagedResourceOperations.apply(ManagedResourceOperations.scala:26) > at > resource.ManagedResourceOperations.apply$(ManagedResourceOperations.scala:26) > at > resource.AbstractManagedResource.apply(AbstractManagedResource.scala:50) > at > resource.ManagedResourceOperations.acquireAndGet(ManagedResourceOperations.scala:25) > > at > resource.ManagedResourceOperations.acquireAndGet$(ManagedResourceOperations.scala:25) > > at > resource.AbstractManagedResource.acquireAndGet(AbstractManagedResource.scala:50) > > at > 
resource.ManagedResourceOperations.foreach(ManagedResourceOperations.scala:53) > > at > resource.ManagedResourceOperations.foreach$(ManagedResourceOperations.scala:53) > > at > resource.AbstractManagedResource.foreach(AbstractManagedResource.scala:50) > at > com.avast.filerep.saver.RequestSaver$.$anonfun$main$6(RequestSaver.scala:19) > at > com.avast.filerep.saver.RequestSaver$.$anonfun$main$6$adapted(RequestSaver.scala:18) > > at > resource.AbstractManagedResource.$anonfun$acquireFor$1(AbstractManagedResource.scala:88) > > at >
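The error message in KAFKA-9203 suggests a workaround of seeking past the undecodable record. A sketch of that recovery loop is below, using a stand-in consumer class (`FakeConsumer` and `FetchError` are ours; the real client is `org.apache.kafka.clients.consumer.KafkaConsumer`, whose `poll()` throws `KafkaException` and which exposes `seek()` on a `TopicPartition`). Note this workaround skips data, so for this particular bug the actual fix was reverting the lz4 change rather than seeking.

```python
class FetchError(Exception):
    """Stand-in for the KafkaException raised while fetching a record."""
    def __init__(self, partition, offset):
        super().__init__(f"bad record at {partition}:{offset}")
        self.partition, self.offset = partition, offset

class FakeConsumer:
    def __init__(self, records):
        # records: offset -> value, or a FetchError to simulate a corrupt record
        self.records, self.position = records, 0

    def poll(self):
        rec = self.records.get(self.position)
        if isinstance(rec, FetchError):
            raise rec           # position does not advance past a bad record
        self.position += 1
        return rec

    def seek(self, offset):
        self.position = offset  # mirrors KafkaConsumer.seek()

consumer = FakeConsumer({0: "a", 1: FetchError("p-0", 1), 2: "b"})
consumed = []
while len(consumed) < 2:
    try:
        consumed.append(consumer.poll())
    except FetchError as e:
        consumer.seek(e.offset + 1)  # skip the undecodable record, losing it
print(consumed)  # -> ['a', 'b']
```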
[jira] [Updated] (KAFKA-9263) Recurrence: Transient failure in kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpointkafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs
[ https://issues.apache.org/jira/browse/KAFKA-9263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Roesler updated KAFKA-9263: Affects Version/s: 2.4.0 > Reocurrence: Transient failure in > kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpointkafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs > --- > > Key: KAFKA-9263 > URL: https://issues.apache.org/jira/browse/KAFKA-9263 > Project: Kafka > Issue Type: Bug > Components: unit tests >Affects Versions: 2.4.0 >Reporter: John Roesler >Priority: Major > > This test has failed for me on > https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/9691/testReport/junit/kafka.api/AdminClientIntegrationTest/testAlterReplicaLogDirs/ > {noformat} > Error Message > org.scalatest.exceptions.TestFailedException: only 0 messages are produced > within timeout after replica movement. Producer future > Some(Failure(java.util.concurrent.TimeoutException: Timeout after waiting for > 1 ms.)) > Stacktrace > org.scalatest.exceptions.TestFailedException: only 0 messages are produced > within timeout after replica movement. 
Producer future > Some(Failure(java.util.concurrent.TimeoutException: Timeout after waiting for > 1 ms.)) > at > org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:530) > at > org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:529) > at > org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1389) > at org.scalatest.Assertions.fail(Assertions.scala:1091) > at org.scalatest.Assertions.fail$(Assertions.scala:1087) > at org.scalatest.Assertions$.fail(Assertions.scala:1389) > at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:842) > at > kafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs(AdminClientIntegrationTest.scala:459) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:834) > Standard Output > [2019-12-03 04:54:16,111] ERROR 
[ReplicaFetcher replicaId=2, leaderId=1, > fetcherId=0] Error for partition unclean-test-topic-1-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-12-03 04:54:21,711] ERROR [ReplicaFetcher replicaId=0, leaderId=1, > fetcherId=0] Error for partition topic-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-12-03 04:54:21,712] ERROR [ReplicaFetcher replicaId=2, leaderId=1, > fetcherId=0] Error for partition topic-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-12-03 04:54:27,092] ERROR [ReplicaFetcher replicaId=2, leaderId=1, > fetcherId=0] Error for partition unclean-test-topic-1-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-12-03 04:54:27,091] ERROR [ReplicaFetcher replicaId=0, leaderId=1, > fetcherId=0] Error for partition unclean-test-topic-1-1 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host
[jira] [Updated] (KAFKA-9263) Recurrence: Transient failure in kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpointkafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs
[ https://issues.apache.org/jira/browse/KAFKA-9263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Roesler updated KAFKA-9263: Fix Version/s: (was: 1.1.0) > Reocurrence: Transient failure in > kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpointkafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs > --- > > Key: KAFKA-9263 > URL: https://issues.apache.org/jira/browse/KAFKA-9263 > Project: Kafka > Issue Type: Bug > Components: unit tests >Reporter: John Roesler >Priority: Major > > This test has failed for me on > https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/9691/testReport/junit/kafka.api/AdminClientIntegrationTest/testAlterReplicaLogDirs/ > {noformat} > Error Message > org.scalatest.exceptions.TestFailedException: only 0 messages are produced > within timeout after replica movement. Producer future > Some(Failure(java.util.concurrent.TimeoutException: Timeout after waiting for > 1 ms.)) > Stacktrace > org.scalatest.exceptions.TestFailedException: only 0 messages are produced > within timeout after replica movement. 
Producer future > Some(Failure(java.util.concurrent.TimeoutException: Timeout after waiting for > 1 ms.)) > at > org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:530) > at > org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:529) > at > org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1389) > at org.scalatest.Assertions.fail(Assertions.scala:1091) > at org.scalatest.Assertions.fail$(Assertions.scala:1087) > at org.scalatest.Assertions$.fail(Assertions.scala:1389) > at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:842) > at > kafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs(AdminClientIntegrationTest.scala:459) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.lang.Thread.run(Thread.java:834) > Standard Output > [2019-12-03 04:54:16,111] ERROR 
[ReplicaFetcher replicaId=2, leaderId=1, > fetcherId=0] Error for partition unclean-test-topic-1-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-12-03 04:54:21,711] ERROR [ReplicaFetcher replicaId=0, leaderId=1, > fetcherId=0] Error for partition topic-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-12-03 04:54:21,712] ERROR [ReplicaFetcher replicaId=2, leaderId=1, > fetcherId=0] Error for partition topic-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-12-03 04:54:27,092] ERROR [ReplicaFetcher replicaId=2, leaderId=1, > fetcherId=0] Error for partition unclean-test-topic-1-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-12-03 04:54:27,091] ERROR [ReplicaFetcher replicaId=0, leaderId=1, > fetcherId=0] Error for partition unclean-test-topic-1-1 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. >
[jira] [Updated] (KAFKA-9264) Recurrence: Flaky Test DelegationTokenEndToEndAuthorizationTest#testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl
[ https://issues.apache.org/jira/browse/KAFKA-9264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Roesler updated KAFKA-9264: Reporter: John Roesler (was: Matthias J. Sax) > Reocurrence: Flaky Test > DelegationTokenEndToEndAuthorizationTest#testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl > -- > > Key: KAFKA-9264 > URL: https://issues.apache.org/jira/browse/KAFKA-9264 > Project: Kafka > Issue Type: Bug > Components: core, security, unit tests >Reporter: John Roesler >Priority: Major > Labels: flaky-test > > This test has failed again for me, so apparently it's not fixed: > https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/9691/testReport/junit/kafka.api/DelegationTokenEndToEndAuthorizationTest/testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl/ > {noformat} > Error Message > org.scalatest.exceptions.TestFailedException: Consumed 0 records before > timeout instead of the expected 1 records > Stacktrace > org.scalatest.exceptions.TestFailedException: Consumed 0 records before > timeout instead of the expected 1 records > at > org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:530) > at > org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:529) > at > org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1389) > at org.scalatest.Assertions.fail(Assertions.scala:1091) > at org.scalatest.Assertions.fail$(Assertions.scala:1087) > at org.scalatest.Assertions$.fail(Assertions.scala:1389) > at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:842) > at kafka.utils.TestUtils$.pollRecordsUntilTrue(TestUtils.scala:796) > at > kafka.utils.TestUtils$.pollUntilAtLeastNumRecords(TestUtils.scala:1335) > at kafka.utils.TestUtils$.consumeRecords(TestUtils.scala:1343) > at > kafka.api.EndToEndAuthorizationTest.consumeRecords(EndToEndAuthorizationTest.scala:530) > at > 
kafka.api.EndToEndAuthorizationTest.testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl(EndToEndAuthorizationTest.scala:369) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:365) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:330) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:78) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:328) > at org.junit.runners.ParentRunner.access$100(ParentRunner.java:65) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:292) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at 
org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305) > at org.junit.runners.ParentRunner.run(ParentRunner.java:412) > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110) > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58) > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38) > at > org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62) > at >
[jira] [Updated] (KAFKA-9263) Reocurrence: Transient failure in kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpointkafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs
[ https://issues.apache.org/jira/browse/KAFKA-9263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Roesler updated KAFKA-9263:

Description: This test has failed for me on https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/9691/testReport/junit/kafka.api/AdminClientIntegrationTest/testAlterReplicaLogDirs/
{noformat}
{noformat}

was: Saw this error once on Jenkins: https://builds.apache.org/job/kafka-pr-jdk9-scala2.12/3025/testReport/junit/kafka.api/AdminClientIntegrationTest/testAlterReplicaLogDirs/
{code}
Stacktrace
java.lang.AssertionError: timed out waiting for message produce
    at kafka.utils.TestUtils$.fail(TestUtils.scala:347)
    at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:861)
    at kafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs(AdminClientIntegrationTest.scala:357)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:564)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.lang.Thread.run(Thread.java:844)

Standard Output
[2017-12-07 19:22:56,297] ERROR ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes (org.apache.zookeeper.server.ZooKeeperServer:472)
[2017-12-07 19:22:59,447] ERROR ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes (org.apache.zookeeper.server.ZooKeeperServer:472)
[2017-12-07 19:22:59,453] ERROR ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes (org.apache.zookeeper.server.ZooKeeperServer:472)
[2017-12-07 19:23:01,335] ERROR Error while creating ephemeral at /controller, node already exists and owner '99134641238966279' does not match current session '99134641238966277' (kafka.zk.KafkaZkClient$CheckedEphemeral:71)
[2017-12-07 19:23:04,695] ERROR ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes (org.apache.zookeeper.server.ZooKeeperServer:472)
[2017-12-07 19:23:04,760] ERROR ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes (org.apache.zookeeper.server.ZooKeeperServer:472)
[2017-12-07 19:23:06,764] ERROR Error while creating ephemeral at /controller, node already exists and owner '99134641586700293' does not match current session '99134641586700295' (kafka.zk.KafkaZkClient$CheckedEphemeral:71)
[2017-12-07 19:23:09,379] ERROR ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes (org.apache.zookeeper.server.ZooKeeperServer:472)
[2017-12-07 19:23:09,387] ERROR ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes (org.apache.zookeeper.server.ZooKeeperServer:472)
[2017-12-07 19:23:11,533] ERROR ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes (org.apache.zookeeper.server.ZooKeeperServer:472)
[2017-12-07 19:23:11,539] ERROR ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes (org.apache.zookeeper.server.ZooKeeperServer:472)
[2017-12-07 19:23:13,022] ERROR Error while creating ephemeral at /controller, node already exists and owner '99134642031034375' does not match current session '99134642031034373' (kafka.zk.KafkaZkClient$CheckedEphemeral:71)
[2017-12-07 19:23:14,667] ERROR ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes (org.apache.zookeeper.server.ZooKeeperServer:472)
[2017-12-07 19:23:14,673] ERROR
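The failure above bubbles up through a polling helper (the `kafka.utils.TestUtils$.waitUntilTrue` frame): the test re-checks a condition until a deadline and converts a timeout into the assertion error seen in the report. As a hypothetical illustration only (the name and signature mirror the stack trace, but this is not Kafka's actual implementation), such a helper can be sketched as:

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch of a poll-until-true test helper of the kind seen in the
// stack traces above. It repeatedly evaluates a condition, pausing between
// attempts, and fails with the caller's message once the deadline passes --
// which is how a slow broker surfaces as "timed out waiting for message produce".
public class WaitUtil {
    public static void waitUntilTrue(BooleanSupplier condition,
                                     String failMessage,
                                     long timeoutMs,
                                     long pauseMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() >= deadline) {
                throw new AssertionError(failMessage);
            }
            Thread.sleep(pauseMs);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // A condition that becomes true after ~50 ms: the wait returns normally.
        waitUntilTrue(() -> System.currentTimeMillis() - start > 50,
                      "condition never became true", 1000, 10);

        // A condition that never becomes true fails with an AssertionError,
        // mirroring the transient failures reported in this ticket.
        try {
            waitUntilTrue(() -> false, "timed out waiting for message produce", 100, 10);
        } catch (AssertionError e) {
            System.out.println("failed as expected: " + e.getMessage());
        }
    }
}
```

Under this reading, flakiness comes from the deadline racing against asynchronous broker work (replica movement, message produce), so a loaded CI machine can miss the timeout even when the logic is correct.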
[jira] [Assigned] (KAFKA-9263) Reocurrence: Transient failure in kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpointkafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs
[ https://issues.apache.org/jira/browse/KAFKA-9263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Roesler reassigned KAFKA-9263: --- Assignee: (was: Dong Lin)
[jira] [Updated] (KAFKA-9263) Reocurrence: Transient failure in kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpointkafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs
[ https://issues.apache.org/jira/browse/KAFKA-9263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Roesler updated KAFKA-9263: Reporter: John Roesler (was: Guozhang Wang)
[jira] [Updated] (KAFKA-9263) Reocurrence: Transient failure in kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpointkafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs
[ https://issues.apache.org/jira/browse/KAFKA-9263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Roesler updated KAFKA-9263: Component/s: (was: unit tests) clients
[jira] [Updated] (KAFKA-9263) Reocurrence: Transient failure in kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpointkafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs
[ https://issues.apache.org/jira/browse/KAFKA-9263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Roesler updated KAFKA-9263: Labels: flaky-test (was: )
[jira] [Created] (KAFKA-9263) Reocurrence: Transient failure in kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpointkafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs
John Roesler created KAFKA-9263:
---
Summary: Reocurrence: Transient failure in kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpointkafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs
Key: KAFKA-9263
URL: https://issues.apache.org/jira/browse/KAFKA-9263
Project: Kafka
Issue Type: Bug
Components: unit tests
Reporter: Guozhang Wang
Assignee: Dong Lin
Fix For: 1.1.0
[jira] [Updated] (KAFKA-9263) Reocurrence: Transient failure in kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpointkafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs
[ https://issues.apache.org/jira/browse/KAFKA-9263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Roesler updated KAFKA-9263: Description: This test has failed for me on https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/9691/testReport/junit/kafka.api/AdminClientIntegrationTest/testAlterReplicaLogDirs/ {noformat} Error Message org.scalatest.exceptions.TestFailedException: only 0 messages are produced within timeout after replica movement. Producer future Some(Failure(java.util.concurrent.TimeoutException: Timeout after waiting for 1 ms.)) Stacktrace org.scalatest.exceptions.TestFailedException: only 0 messages are produced within timeout after replica movement. Producer future Some(Failure(java.util.concurrent.TimeoutException: Timeout after waiting for 1 ms.)) at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:530) at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:529) at org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1389) at org.scalatest.Assertions.fail(Assertions.scala:1091) at org.scalatest.Assertions.fail$(Assertions.scala:1087) at org.scalatest.Assertions$.fail(Assertions.scala:1389) at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:842) at kafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs(AdminClientIntegrationTest.scala:459) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.lang.Thread.run(Thread.java:834) Standard Output [2019-12-03 04:54:16,111] ERROR [ReplicaFetcher replicaId=2, leaderId=1, fetcherId=0] Error for partition unclean-test-topic-1-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition. [2019-12-03 04:54:21,711] ERROR [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Error for partition topic-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition. [2019-12-03 04:54:21,712] ERROR [ReplicaFetcher replicaId=2, leaderId=1, fetcherId=0] Error for partition topic-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition. [2019-12-03 04:54:27,092] ERROR [ReplicaFetcher replicaId=2, leaderId=1, fetcherId=0] Error for partition unclean-test-topic-1-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition. 
[2019-12-03 04:54:27,091] ERROR [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Error for partition unclean-test-topic-1-1 at offset 0 (kafka.server.ReplicaFetcherThread:76) org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition. [2019-12-03 04:54:35,263] ERROR [ReplicaFetcher replicaId=1, leaderId=0, fetcherId=0] Error for partition elect-preferred-leaders-topic-1-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition. [2019-12-03 04:54:35,264] ERROR [ReplicaFetcher replicaId=2, leaderId=0, fetcherId=0] Error for partition elect-preferred-leaders-topic-1-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition. [2019-12-03 04:54:35,463] ERROR [ReplicaFetcher replicaId=1, leaderId=0, fetcherId=0] Error for partition
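The failure above is TestUtils.waitUntilTrue giving up: the test produces records after the replica move and polls for them until a deadline, and the assertion fires when the deadline passes first. A minimal sketch of that polling shape (illustrative names and defaults, not Kafka's actual implementation):

```java
import java.util.function.BooleanSupplier;

public class WaitUntil {
    /**
     * Poll a condition until it holds or a timeout elapses -- the shape of
     * kafka.utils.TestUtils.waitUntilTrue, where the assertion above fires.
     */
    public static boolean waitUntilTrue(BooleanSupplier condition, long timeoutMs, long pollIntervalMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (true) {
            if (condition.getAsBoolean()) return true;               // condition met in time
            if (System.currentTimeMillis() >= deadline) return false; // timed out -> test fails
            try {
                Thread.sleep(pollIntervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(waitUntilTrue(() -> true, 1000, 10));  // prints true
        System.out.println(waitUntilTrue(() -> false, 50, 10));   // prints false
    }
}
```

Flakiness of this kind usually means the condition is legitimate but the deadline is too tight for a loaded CI node, which is why the failure reports "only 0 messages are produced within timeout" rather than a functional error.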
[jira] [Updated] (KAFKA-9264) Recurrence: Flaky Test DelegationTokenEndToEndAuthorizationTest#testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl
[ https://issues.apache.org/jira/browse/KAFKA-9264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Roesler updated KAFKA-9264: Description: This test has failed again for me, so apparently it's not fixed: https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/9691/testReport/junit/kafka.api/DelegationTokenEndToEndAuthorizationTest/testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl/ {noformat} Error Message org.scalatest.exceptions.TestFailedException: Consumed 0 records before timeout instead of the expected 1 records Stacktrace org.scalatest.exceptions.TestFailedException: Consumed 0 records before timeout instead of the expected 1 records at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:530) at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:529) at org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1389) at org.scalatest.Assertions.fail(Assertions.scala:1091) at org.scalatest.Assertions.fail$(Assertions.scala:1087) at org.scalatest.Assertions$.fail(Assertions.scala:1389) at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:842) at kafka.utils.TestUtils$.pollRecordsUntilTrue(TestUtils.scala:796) at kafka.utils.TestUtils$.pollUntilAtLeastNumRecords(TestUtils.scala:1335) at kafka.utils.TestUtils$.consumeRecords(TestUtils.scala:1343) at kafka.api.EndToEndAuthorizationTest.consumeRecords(EndToEndAuthorizationTest.scala:530) at kafka.api.EndToEndAuthorizationTest.testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl(EndToEndAuthorizationTest.scala:369) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305) at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:365) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) at org.junit.runners.ParentRunner$4.run(ParentRunner.java:330) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:78) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:328) at org.junit.runners.ParentRunner.access$100(ParentRunner.java:65) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:292) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305) at org.junit.runners.ParentRunner.run(ParentRunner.java:412) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38) at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62) at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36) at
[jira] [Created] (KAFKA-9264) Recurrence: Flaky Test DelegationTokenEndToEndAuthorizationTest#testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl
John Roesler created KAFKA-9264: --- Summary: Recurrence: Flaky Test DelegationTokenEndToEndAuthorizationTest#testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl Key: KAFKA-9264 URL: https://issues.apache.org/jira/browse/KAFKA-9264 Project: Kafka Issue Type: Bug Components: core, security, unit tests Reporter: Matthias J. Sax [https://builds.apache.org/job/kafka-pr-jdk11-scala2.13/2645/testReport/junit/kafka.api/DelegationTokenEndToEndAuthorizationTest/testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl/] {quote}org.scalatest.exceptions.TestFailedException: Consumed 0 records before timeout instead of the expected 1 records at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:530) at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:529) at org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1389) at org.scalatest.Assertions.fail(Assertions.scala:1091) at org.scalatest.Assertions.fail$(Assertions.scala:1087) at org.scalatest.Assertions$.fail(Assertions.scala:1389) at kafka.utils.TestUtils$.pollUntilAtLeastNumRecords(TestUtils.scala:841) at kafka.utils.TestUtils$.consumeRecords(TestUtils.scala:1342) at kafka.api.EndToEndAuthorizationTest.consumeRecords(EndToEndAuthorizationTest.scala:529) at kafka.api.EndToEndAuthorizationTest.testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl(EndToEndAuthorizationTest.scala:368){quote} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Reopened] (KAFKA-8937) Flaky Test SaslPlainSslEndToEndAuthorizationTest#testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl
[ https://issues.apache.org/jira/browse/KAFKA-8937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Roesler reopened KAFKA-8937: - This test has just failed again for me. Apparently, it is not fixed. Unfortunately, Jenkins didn't capture the logs or exception, so the only thing I have is the console output: {noformat} kafka.api.SaslPlainSslEndToEndAuthorizationTest > testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl FAILED org.scalatest.exceptions.TestFailedException: Consumed 0 records before timeout instead of the expected 1 records {noformat} I doubt this is enough to go on, though... https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/26926/consoleText > Flaky Test > SaslPlainSslEndToEndAuthorizationTest#testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl > -- > > Key: KAFKA-8937 > URL: https://issues.apache.org/jira/browse/KAFKA-8937 > Project: Kafka > Issue Type: Bug > Components: core >Reporter: Sophie Blee-Goldman >Priority: Major > Labels: flaky-test > > h3. Error Message > org.scalatest.exceptions.TestFailedException: Consumed 0 records before > timeout instead of the expected 1 records > h3. 
Stacktrace > org.scalatest.exceptions.TestFailedException: Consumed 0 records before > timeout instead of the expected 1 records at > org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:530) > at > org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1389) > at org.scalatest.Assertions$class.fail(Assertions.scala:1091) at > org.scalatest.Assertions$.fail(Assertions.scala:1389) at > kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:830) at > kafka.utils.TestUtils$.pollRecordsUntilTrue(TestUtils.scala:792) at > kafka.utils.TestUtils$.pollUntilAtLeastNumRecords(TestUtils.scala:1324) at > kafka.utils.TestUtils$.consumeRecords(TestUtils.scala:1333) at > kafka.api.EndToEndAuthorizationTest.consumeRecords(EndToEndAuthorizationTest.scala:529) > at > kafka.api.EndToEndAuthorizationTest.testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl(EndToEndAuthorizationTest.scala:368) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305) at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:365) at > 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:330) at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:78) at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:328) at > org.junit.runners.ParentRunner.access$100(ParentRunner.java:65) at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:292) at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305) at > org.junit.runners.ParentRunner.run(ParentRunner.java:412) at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110) > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58) > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38) > at > org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62) > at > org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51) > at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source) at >
[jira] [Commented] (KAFKA-9173) StreamsPartitionAssignor assigns partitions to only one worker
[ https://issues.apache.org/jira/browse/KAFKA-9173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987017#comment-16987017 ] Oleg Muravskiy commented on KAFKA-9173: --- Hi [~ableegoldman], I think I'm missing something there. I have 210 partitions to consume from, and I have configured Streams to run with 200 threads. Why do I have only 10 tasks? > StreamsPartitionAssignor assigns partitions to only one worker > -- > > Key: KAFKA-9173 > URL: https://issues.apache.org/jira/browse/KAFKA-9173 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.3.0, 2.2.1 >Reporter: Oleg Muravskiy >Priority: Major > Labels: user-experience > Attachments: StreamsPartitionAssignor.log > > > I'm running a distributed KafkaStreams application on 10 worker nodes, > subscribed to 21 topics with 10 partitions in each. I'm only using a > Processor interface, and a persistent state store. > However, only one worker gets assigned partitions, all other workers get > nothing. Restarting the application, or cleaning local state stores does not > help. StreamsPartitionAssignor migrates to other nodes, and eventually picks > up other node to assign partitions to, but still only one node. > It's difficult to figure out where to look for the signs of problems, I'm > attaching the log messages from the StreamsPartitionAssignor. Let me know > what else I could provide to help resolve this. > [^StreamsPartitionAssignor.log] -- This message was sent by Atlassian Jira (v8.3.4#803005)
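The 210-partitions-but-10-tasks arithmetic in the comment above follows from how Streams forms tasks: one task per partition group of each subtopology, and topics read by the same subtopology are co-partitioned, so 21 topics of 10 partitions each in a single subtopology yield 10 tasks, and only 10 of the 200 configured threads get work. A sketch of that arithmetic (illustrative helper, not Streams API):

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class TaskCountSketch {
    /**
     * One task per partition of each subtopology's partition group: within a
     * subtopology the input topics are co-partitioned, so the task count is
     * the max partition count among those topics, not the sum across topics.
     */
    static int taskCount(Map<String, List<Integer>> partitionCountsBySubtopology) {
        return partitionCountsBySubtopology.values().stream()
                .mapToInt(counts -> counts.stream().mapToInt(Integer::intValue).max().orElse(0))
                .sum();
    }

    public static void main(String[] args) {
        // KAFKA-9173's shape: 21 topics x 10 partitions, all read by ONE subtopology.
        List<Integer> topics = Collections.nCopies(21, 10);
        System.out.println(taskCount(Map.of("subtopology-0", topics))); // prints 10, not 210
    }
}
```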
[jira] [Updated] (KAFKA-9258) Connect ConnectorStatusMetricsGroup sometimes NPE
[ https://issues.apache.org/jira/browse/KAFKA-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikumar updated KAFKA-9258: - Priority: Blocker (was: Major) > Connect ConnectorStatusMetricsGroup sometimes NPE > - > > Key: KAFKA-9258 > URL: https://issues.apache.org/jira/browse/KAFKA-9258 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 2.4.0 >Reporter: Cyrus Vafadari >Priority: Blocker > Fix For: 2.4.0 > > > java.lang.NullPointerException > at > org.apache.kafka.connect.runtime.Worker$ConnectorStatusMetricsGroup.recordTaskRemoved(Worker.java:901) > at > org.apache.kafka.connect.runtime.Worker.awaitStopTask(Worker.java:720) > at > org.apache.kafka.connect.runtime.Worker.awaitStopTasks(Worker.java:740) > at > org.apache.kafka.connect.runtime.Worker.stopAndAwaitTasks(Worker.java:758) > at > org.apache.kafka.connect.runtime.distributed.DistributedHerder.processTaskConfigUpdatesWithIncrementalCooperative(DistributedHerder.java:575) > at > org.apache.kafka.connect.runtime.distributed.DistributedHerder.tick(DistributedHerder.java:396) > at > org.apache.kafka.connect.runtime.distributed.DistributedHerder.run(DistributedHerder.java:282) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266) > at java.util.concurrent.FutureTask.run(FutureTask.java) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-9258) Connect ConnectorStatusMetricsGroup sometimes NPE
[ https://issues.apache.org/jira/browse/KAFKA-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikumar updated KAFKA-9258: - Fix Version/s: 2.4.0 > Connect ConnectorStatusMetricsGroup sometimes NPE > - > > Key: KAFKA-9258 > URL: https://issues.apache.org/jira/browse/KAFKA-9258 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 2.4.0 >Reporter: Cyrus Vafadari >Priority: Major > Fix For: 2.4.0 > > > java.lang.NullPointerException > at > org.apache.kafka.connect.runtime.Worker$ConnectorStatusMetricsGroup.recordTaskRemoved(Worker.java:901) > at > org.apache.kafka.connect.runtime.Worker.awaitStopTask(Worker.java:720) > at > org.apache.kafka.connect.runtime.Worker.awaitStopTasks(Worker.java:740) > at > org.apache.kafka.connect.runtime.Worker.stopAndAwaitTasks(Worker.java:758) > at > org.apache.kafka.connect.runtime.distributed.DistributedHerder.processTaskConfigUpdatesWithIncrementalCooperative(DistributedHerder.java:575) > at > org.apache.kafka.connect.runtime.distributed.DistributedHerder.tick(DistributedHerder.java:396) > at > org.apache.kafka.connect.runtime.distributed.DistributedHerder.run(DistributedHerder.java:282) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266) > at java.util.concurrent.FutureTask.run(FutureTask.java) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) -- This message was sent by Atlassian Jira (v8.3.4#803005)
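The NPE above surfaces in recordTaskRemoved when a task is stopped for a connector whose metrics entry is absent or already removed, i.e. a lookup is dereferenced without a guard. A minimal sketch of the defensive pattern that avoids this, with hypothetical names (this is not Connect's actual code or fix):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class StatusMetricsSketch {
    // Hypothetical stand-in for per-connector metric state keyed by connector name.
    private final Map<String, Integer> runningTaskCounts = new ConcurrentHashMap<>();

    void recordTaskAdded(String connector) {
        runningTaskCounts.merge(connector, 1, Integer::sum);
    }

    /**
     * Guarded removal: computeIfPresent is a no-op when the entry is missing,
     * where a bare runningTaskCounts.get(connector) - 1 would throw the NPE
     * seen in the stack trace. Returning null drops the entry at zero.
     */
    void recordTaskRemoved(String connector) {
        runningTaskCounts.computeIfPresent(connector, (k, v) -> v > 1 ? v - 1 : null);
    }

    int runningTasks(String connector) {
        return runningTaskCounts.getOrDefault(connector, 0);
    }
}
```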
[jira] [Assigned] (KAFKA-9203) kafka-client 2.3.1 fails to consume lz4 compressed topic in kafka 2.1.1
[ https://issues.apache.org/jira/browse/KAFKA-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismael Juma reassigned KAFKA-9203: -- Assignee: Ismael Juma > kafka-client 2.3.1 fails to consume lz4 compressed topic in kafka 2.1.1 > --- > > Key: KAFKA-9203 > URL: https://issues.apache.org/jira/browse/KAFKA-9203 > Project: Kafka > Issue Type: Bug > Components: compression, consumer >Affects Versions: 2.3.0, 2.3.1 >Reporter: David Watzke >Assignee: Ismael Juma >Priority: Blocker > Fix For: 2.4.0, 2.3.2 > > Attachments: kafka-clients-2.3.2-SNAPSHOT.jar > > > I run kafka cluster 2.1.1 > when I upgraded the consumer app to use kafka-client 2.3.0 (or 2.3.1) instead > of 2.2.0, I immediately started getting the following exceptions in a loop > when consuming a topic with LZ4-compressed messages: > {noformat} > 2019-11-18 11:59:16.888 ERROR [afka-consumer-thread] > com.avast.utils2.kafka.consumer.ConsumerThread: Unexpected exception occurred > while polling and processing messages: org.apache.kafka.common.KafkaExce > ption: Received exception when fetching the next record from > FILEREP_LONER_REQUEST-4. If needed, please seek past the record to continue > consumption. > org.apache.kafka.common.KafkaException: Received exception when fetching the > next record from FILEREP_LONER_REQUEST-4. If needed, please seek past the > record to continue consumption. 
> at > org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.fetchRecords(Fetcher.java:1473) > > at > org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.access$1600(Fetcher.java:1332) > > at > org.apache.kafka.clients.consumer.internals.Fetcher.fetchRecords(Fetcher.java:645) > > at > org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:606) > > at > org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1263) > > at > org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1225) > at > org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1201) > at > com.avast.utils2.kafka.consumer.ConsumerThread.pollAndProcessMessagesFromKafka(ConsumerThread.java:180) > > at > com.avast.utils2.kafka.consumer.ConsumerThread.run(ConsumerThread.java:149) > at > com.avast.filerep.saver.RequestSaver$.$anonfun$main$8(RequestSaver.scala:28) > at > com.avast.filerep.saver.RequestSaver$.$anonfun$main$8$adapted(RequestSaver.scala:19) > > at > resource.AbstractManagedResource.$anonfun$acquireFor$1(AbstractManagedResource.scala:88) > > at > scala.util.control.Exception$Catch.$anonfun$either$1(Exception.scala:252) > at scala.util.control.Exception$Catch.apply(Exception.scala:228) > at scala.util.control.Exception$Catch.either(Exception.scala:252) > at > resource.AbstractManagedResource.acquireFor(AbstractManagedResource.scala:88) > at > resource.ManagedResourceOperations.apply(ManagedResourceOperations.scala:26) > at > resource.ManagedResourceOperations.apply$(ManagedResourceOperations.scala:26) > at > resource.AbstractManagedResource.apply(AbstractManagedResource.scala:50) > at > resource.ManagedResourceOperations.acquireAndGet(ManagedResourceOperations.scala:25) > > at > resource.ManagedResourceOperations.acquireAndGet$(ManagedResourceOperations.scala:25) > > at > resource.AbstractManagedResource.acquireAndGet(AbstractManagedResource.scala:50) > > at > 
resource.ManagedResourceOperations.foreach(ManagedResourceOperations.scala:53) > > at > resource.ManagedResourceOperations.foreach$(ManagedResourceOperations.scala:53) > > at > resource.AbstractManagedResource.foreach(AbstractManagedResource.scala:50) > at > com.avast.filerep.saver.RequestSaver$.$anonfun$main$6(RequestSaver.scala:19) > at > com.avast.filerep.saver.RequestSaver$.$anonfun$main$6$adapted(RequestSaver.scala:18) > > at > resource.AbstractManagedResource.$anonfun$acquireFor$1(AbstractManagedResource.scala:88) > > at > scala.util.control.Exception$Catch.$anonfun$either$1(Exception.scala:252) > at scala.util.control.Exception$Catch.apply(Exception.scala:228) > at scala.util.control.Exception$Catch.either(Exception.scala:252) > at > resource.AbstractManagedResource.acquireFor(AbstractManagedResource.scala:88) > at > resource.ManagedResourceOperations.apply(ManagedResourceOperations.scala:26) > at > resource.ManagedResourceOperations.apply$(ManagedResourceOperations.scala:26) > at >
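The exception message above suggests the interim workaround: seek past the record that fails to decompress and resume polling, accepting the loss of that record (with the real client, roughly consumer.seek(tp, consumer.position(tp) + 1)). A self-contained sketch of that recovery step using stand-in types, since the real one needs a broker (names here are illustrative, not the kafka-clients API):

```java
import java.util.HashMap;
import java.util.Map;

public class SeekPastSketch {

    /** Minimal stand-in for org.apache.kafka.common.TopicPartition. */
    static final class TP {
        final String topic;
        final int partition;
        TP(String topic, int partition) { this.topic = topic; this.partition = partition; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof TP)) return false;
            TP other = (TP) o;
            return partition == other.partition && topic.equals(other.topic);
        }
        @Override public int hashCode() { return topic.hashCode() * 31 + partition; }
        @Override public String toString() { return topic + "-" + partition; }
    }

    /** A fetch failure that names the failing partition and offset. */
    static final class FetchException extends RuntimeException {
        final TP partition;
        final long offset;
        FetchException(TP partition, long offset) {
            super("Received exception when fetching the next record from " + partition
                    + ". If needed, please seek past the record to continue consumption.");
            this.partition = partition;
            this.offset = offset;
        }
    }

    /**
     * The workaround: advance the consumer's position one past the failing
     * record and resume polling, skipping that one record.
     */
    static long seekPast(Map<TP, Long> positions, FetchException e) {
        long next = e.offset + 1;
        positions.put(e.partition, next);
        return next;
    }

    public static void main(String[] args) {
        Map<TP, Long> positions = new HashMap<>();
        TP tp = new TP("FILEREP_LONER_REQUEST", 4);
        positions.put(tp, 100L);
        System.out.println(seekPast(positions, new FetchException(tp, 100L))); // prints 101
    }
}
```

This only masks the symptom; the actual fix for the decompression regression went into the 2.3.2/2.4.0 client per the Fix Versions above.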
[jira] [Updated] (KAFKA-9203) kafka-client 2.3.1 fails to consume lz4 compressed topic in kafka 2.1.1
[ https://issues.apache.org/jira/browse/KAFKA-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismael Juma updated KAFKA-9203: --- Priority: Blocker (was: Critical) > kafka-client 2.3.1 fails to consume lz4 compressed topic in kafka 2.1.1 > --- > > Key: KAFKA-9203 > URL: https://issues.apache.org/jira/browse/KAFKA-9203 > Project: Kafka > Issue Type: Bug > Components: compression, consumer >Affects Versions: 2.3.0, 2.3.1 >Reporter: David Watzke >Priority: Blocker > Fix For: 2.4.0, 2.3.2 > > Attachments: kafka-clients-2.3.2-SNAPSHOT.jar > > > I run kafka cluster 2.1.1 > when I upgraded the consumer app to use kafka-client 2.3.0 (or 2.3.1) instead > of 2.2.0, I immediately started getting the following exceptions in a loop > when consuming a topic with LZ4-compressed messages: > {noformat} > 2019-11-18 11:59:16.888 ERROR [afka-consumer-thread] > com.avast.utils2.kafka.consumer.ConsumerThread: Unexpected exception occurred > while polling and processing messages: org.apache.kafka.common.KafkaExce > ption: Received exception when fetching the next record from > FILEREP_LONER_REQUEST-4. If needed, please seek past the record to continue > consumption. > org.apache.kafka.common.KafkaException: Received exception when fetching the > next record from FILEREP_LONER_REQUEST-4. If needed, please seek past the > record to continue consumption. 
[jira] [Updated] (KAFKA-9203) kafka-client 2.3.1 fails to consume lz4 compressed topic in kafka 2.1.1
[ https://issues.apache.org/jira/browse/KAFKA-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismael Juma updated KAFKA-9203: --- Fix Version/s: 2.3.2 2.4.0 > kafka-client 2.3.1 fails to consume lz4 compressed topic in kafka 2.1.1 > --- > > Key: KAFKA-9203 > URL: https://issues.apache.org/jira/browse/KAFKA-9203 > Project: Kafka > Issue Type: Bug > Components: compression, consumer >Affects Versions: 2.3.0, 2.3.1 >Reporter: David Watzke >Priority: Critical > Fix For: 2.4.0, 2.3.2 > > Attachments: kafka-clients-2.3.2-SNAPSHOT.jar > > > I run kafka cluster 2.1.1 > when I upgraded the consumer app to use kafka-client 2.3.0 (or 2.3.1) instead > of 2.2.0, I immediately started getting the following exceptions in a loop > when consuming a topic with LZ4-compressed messages: > {noformat} > 2019-11-18 11:59:16.888 ERROR [afka-consumer-thread] > com.avast.utils2.kafka.consumer.ConsumerThread: Unexpected exception occurred > while polling and processing messages: org.apache.kafka.common.KafkaExce > ption: Received exception when fetching the next record from > FILEREP_LONER_REQUEST-4. If needed, please seek past the record to continue > consumption. > org.apache.kafka.common.KafkaException: Received exception when fetching the > next record from FILEREP_LONER_REQUEST-4. If needed, please seek past the > record to continue consumption. 
[jira] [Updated] (KAFKA-9203) kafka-client 2.3.1 fails to consume lz4 compressed topic
[ https://issues.apache.org/jira/browse/KAFKA-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismael Juma updated KAFKA-9203: --- Summary: kafka-client 2.3.1 fails to consume lz4 compressed topic (was: kafka-client 2.3.1 fails to consume lz4 compressed topic in kafka 2.1.1) > kafka-client 2.3.1 fails to consume lz4 compressed topic > > > Key: KAFKA-9203 > URL: https://issues.apache.org/jira/browse/KAFKA-9203 > Project: Kafka > Issue Type: Bug > Components: compression, consumer >Affects Versions: 2.3.0, 2.3.1 >Reporter: David Watzke >Assignee: Ismael Juma >Priority: Blocker > Fix For: 2.4.0, 2.3.2 > > Attachments: kafka-clients-2.3.2-SNAPSHOT.jar > > > I run kafka cluster 2.1.1 > when I upgraded the consumer app to use kafka-client 2.3.0 (or 2.3.1) instead > of 2.2.0, I immediately started getting the following exceptions in a loop > when consuming a topic with LZ4-compressed messages: > {noformat} > 2019-11-18 11:59:16.888 ERROR [afka-consumer-thread] > com.avast.utils2.kafka.consumer.ConsumerThread: Unexpected exception occurred > while polling and processing messages: org.apache.kafka.common.KafkaExce > ption: Received exception when fetching the next record from > FILEREP_LONER_REQUEST-4. If needed, please seek past the record to continue > consumption. > org.apache.kafka.common.KafkaException: Received exception when fetching the > next record from FILEREP_LONER_REQUEST-4. If needed, please seek past the > record to continue consumption. 
> at > org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.fetchRecords(Fetcher.java:1473) > > at > org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.access$1600(Fetcher.java:1332) > > at > org.apache.kafka.clients.consumer.internals.Fetcher.fetchRecords(Fetcher.java:645) > > at > org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:606) > > at > org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1263) > > at > org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1225) > at > org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1201) > at > com.avast.utils2.kafka.consumer.ConsumerThread.pollAndProcessMessagesFromKafka(ConsumerThread.java:180) > > at > com.avast.utils2.kafka.consumer.ConsumerThread.run(ConsumerThread.java:149) > at > com.avast.filerep.saver.RequestSaver$.$anonfun$main$8(RequestSaver.scala:28) > at > com.avast.filerep.saver.RequestSaver$.$anonfun$main$8$adapted(RequestSaver.scala:19) > > at > resource.AbstractManagedResource.$anonfun$acquireFor$1(AbstractManagedResource.scala:88) > > at > scala.util.control.Exception$Catch.$anonfun$either$1(Exception.scala:252) > at scala.util.control.Exception$Catch.apply(Exception.scala:228) > at scala.util.control.Exception$Catch.either(Exception.scala:252) > at > resource.AbstractManagedResource.acquireFor(AbstractManagedResource.scala:88) > at > resource.ManagedResourceOperations.apply(ManagedResourceOperations.scala:26) > at > resource.ManagedResourceOperations.apply$(ManagedResourceOperations.scala:26) > at > resource.AbstractManagedResource.apply(AbstractManagedResource.scala:50) > at > resource.ManagedResourceOperations.acquireAndGet(ManagedResourceOperations.scala:25) > > at > resource.ManagedResourceOperations.acquireAndGet$(ManagedResourceOperations.scala:25) > > at > resource.AbstractManagedResource.acquireAndGet(AbstractManagedResource.scala:50) > > at > 
resource.ManagedResourceOperations.foreach(ManagedResourceOperations.scala:53) > > at > resource.ManagedResourceOperations.foreach$(ManagedResourceOperations.scala:53) > > at > resource.AbstractManagedResource.foreach(AbstractManagedResource.scala:50) > at > com.avast.filerep.saver.RequestSaver$.$anonfun$main$6(RequestSaver.scala:19) > at > com.avast.filerep.saver.RequestSaver$.$anonfun$main$6$adapted(RequestSaver.scala:18) > > at > resource.AbstractManagedResource.$anonfun$acquireFor$1(AbstractManagedResource.scala:88) > > at > scala.util.control.Exception$Catch.$anonfun$either$1(Exception.scala:252) > at scala.util.control.Exception$Catch.apply(Exception.scala:228) > at scala.util.control.Exception$Catch.either(Exception.scala:252) > at > resource.AbstractManagedResource.acquireFor(AbstractManagedResource.scala:88) > at > resource.ManagedResourceOperations.apply(ManagedResourceOperations.scala:26) > at >
[jira] [Commented] (KAFKA-9203) kafka-client 2.3.1 fails to consume lz4 compressed topic in kafka 2.1.1
[ https://issues.apache.org/jira/browse/KAFKA-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986929#comment-16986929 ] Ismael Juma commented on KAFKA-9203: [~dwatzke] Thanks for testing! I submitted a PR to revert the change as an immediate step and will try to resubmit the original change after I understand the underlying issue. > kafka-client 2.3.1 fails to consume lz4 compressed topic in kafka 2.1.1 > --- > > Key: KAFKA-9203 > URL: https://issues.apache.org/jira/browse/KAFKA-9203 > Project: Kafka > Issue Type: Bug > Components: compression, consumer >Affects Versions: 2.3.0, 2.3.1 >Reporter: David Watzke >Priority: Critical > Attachments: kafka-clients-2.3.2-SNAPSHOT.jar > > > I run kafka cluster 2.1.1 > when I upgraded the consumer app to use kafka-client 2.3.0 (or 2.3.1) instead > of 2.2.0, I immediately started getting the following exceptions in a loop > when consuming a topic with LZ4-compressed messages: > {noformat} > 2019-11-18 11:59:16.888 ERROR [afka-consumer-thread] > com.avast.utils2.kafka.consumer.ConsumerThread: Unexpected exception occurred > while polling and processing messages: org.apache.kafka.common.KafkaExce > ption: Received exception when fetching the next record from > FILEREP_LONER_REQUEST-4. If needed, please seek past the record to continue > consumption. > org.apache.kafka.common.KafkaException: Received exception when fetching the > next record from FILEREP_LONER_REQUEST-4. If needed, please seek past the > record to continue consumption. 
[jira] [Commented] (KAFKA-9203) kafka-client 2.3.1 fails to consume lz4 compressed topic in kafka 2.1.1
[ https://issues.apache.org/jira/browse/KAFKA-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986927#comment-16986927 ] ASF GitHub Bot commented on KAFKA-9203: --- ijuma commented on pull request #7769: KAFKA-9203: Revert "MINOR: Remove workarounds for lz4-java bug affecting byte buffers (#6679)" URL: https://github.com/apache/kafka/pull/7769 This reverts commit 90043d5f as it caused a regression in some cases: > Caused by: java.io.IOException: Stream frame descriptor corrupted > at org.apache.kafka.common.record.KafkaLZ4BlockInputStream.readHeader(KafkaLZ4BlockInputStream.java:132) > at org.apache.kafka.common.record.KafkaLZ4BlockInputStream.<init>(KafkaLZ4BlockInputStream.java:78) > at org.apache.kafka.common.record.CompressionType$4.wrapForInput(CompressionType.java:110) I will investigate why afterwards, but I want to get the safe fix into 2.4.0. The reporter of KAFKA-9203 has verified that reverting this change makes the problem go away. ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org > kafka-client 2.3.1 fails to consume lz4 compressed topic in kafka 2.1.1 > --- > > Key: KAFKA-9203 > URL: https://issues.apache.org/jira/browse/KAFKA-9203 > Project: Kafka > Issue Type: Bug > Components: compression, consumer >Affects Versions: 2.3.0, 2.3.1 >Reporter: David Watzke >Priority: Critical > Attachments: kafka-clients-2.3.2-SNAPSHOT.jar > > > I run kafka cluster 2.1.1 > when I upgraded the consumer app to use kafka-client 2.3.0 (or 2.3.1) instead > of 2.2.0, I immediately started getting the following exceptions in a loop > when consuming a topic with LZ4-compressed messages: > {noformat} > 2019-11-18 11:59:16.888 ERROR [afka-consumer-thread] > com.avast.utils2.kafka.consumer.ConsumerThread: Unexpected exception occurred > while polling and processing messages: org.apache.kafka.common.KafkaExce > ption: Received exception when fetching the next record from > FILEREP_LONER_REQUEST-4. If needed, please seek past the record to continue > consumption. > org.apache.kafka.common.KafkaException: Received exception when fetching the > next record from FILEREP_LONER_REQUEST-4. If needed, please seek past the > record to continue consumption. 
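As an illustrative aside (this is not Kafka's actual KafkaLZ4BlockInputStream code): the "Stream frame descriptor corrupted" failure quoted in the revert above is thrown when the frame-header check in readHeader fails, and ByteBuffer position handling is exactly where such a check can silently go wrong. Below is a minimal sketch of a position-safe magic check, assuming only the standard LZ4 frame magic number 0x184D2204; the class and method names are invented for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative sketch, not Kafka's real code. An LZ4 frame begins with the
// little-endian magic number 0x184D2204. A subtle class of byte-buffer bug
// (the kind the KAFKA-9203 revert guards against) is reading through the
// buffer's backing array while ignoring position()/arrayOffset(), which
// misreads sliced or offset buffers. The absolute read below honors the
// buffer's current position and assumes nothing about its backing array.
class Lz4FrameCheck {
    static final int LZ4_FRAME_MAGIC = 0x184D2204;

    static boolean hasValidMagic(ByteBuffer in) {
        if (in.remaining() < 4) {
            return false; // not enough bytes for the 4-byte magic
        }
        // Absolute read at the current position; does not consume bytes.
        return in.order(ByteOrder.LITTLE_ENDIAN).getInt(in.position()) == LZ4_FRAME_MAGIC;
    }
}
```

The same check against a buffer whose position is non-zero (e.g. after a partial read) still succeeds, which is the property the array-based shortcut loses.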
[jira] [Commented] (KAFKA-9071) transactional.id.expiration.ms config value should be implemented as a Long
[ https://issues.apache.org/jira/browse/KAFKA-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986881#comment-16986881 ] Mario Georgiev commented on KAFKA-9071: --- Hello [~mjsax] who is appropriate to tag for a review? Or should I just choose the lead of the appropriate component? > transactional.id.expiration.ms config value should be implemented as a Long > --- > > Key: KAFKA-9071 > URL: https://issues.apache.org/jira/browse/KAFKA-9071 > Project: Kafka > Issue Type: Improvement > Components: config >Affects Versions: 2.3.0 >Reporter: Joost van de Wijgerd >Assignee: Mario Georgiev >Priority: Major > > Currently the value of this config parameter is limited to MAX_INT > effectively limiting the transactional id expiration to ~ 25 days. This is > causing some issues for us on our Acceptance environment (which is not used > that often / heavily) where our transactional services will start failing > because of this issue. > I believe best practice for millisecond values should be to implement them as > a Long and not as an Integer > this is currently the max value: transactional.id.expiration.ms=2147483647 > while I would like to set it to: transactional.id.expiration.ms=3154000 > (i.e. 1 year) -- This message was sent by Atlassian Jira (v8.3.4#803005)
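The arithmetic behind this ticket can be sketched directly. Nothing below is Kafka code; it only shows the int-vs-long bounds that motivate the request: Integer.MAX_VALUE milliseconds is just under 25 days, while a year of milliseconds does not fit in an int at all.

```java
// Quick arithmetic behind KAFKA-9071: a config stored as an int of
// milliseconds tops out at Integer.MAX_VALUE ms (~24.86 days), while one
// year of milliseconds requires a long.
public class ExpirationMsLimit {
    public static void main(String[] args) {
        long maxIntMs = Integer.MAX_VALUE;                 // 2147483647 ms
        double days = maxIntMs / (1000.0 * 60 * 60 * 24);  // ~24.86 days
        long oneYearMs = 365L * 24 * 60 * 60 * 1000;       // 31536000000 ms
        System.out.printf("max int value as days: %.2f%n", days);
        System.out.println("one year fits in an int: " + (oneYearMs <= Integer.MAX_VALUE));
    }
}
```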
[jira] [Created] (KAFKA-9262) kafka broker fails to response fetching request from consumer when quota limit is violated
Junqi Zhang created KAFKA-9262: -- Summary: kafka broker fails to response fetching request from consumer when quota limit is violated Key: KAFKA-9262 URL: https://issues.apache.org/jira/browse/KAFKA-9262 Project: Kafka Issue Type: Bug Affects Versions: 2.0.0 Reporter: Junqi Zhang The Kafka broker fails to respond to a consumer's fetch request when the quota limit is violated (consumer_byte_rate is exceeded). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-9234) Consider using @Nullable and @Nonnull annotations
[ https://issues.apache.org/jira/browse/KAFKA-9234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986825#comment-16986825 ] oulabas commented on KAFKA-9234: Hello [~mjsax], I am a new contributor and I would like to start with this ticket. Can I go ahead and assign this to myself? > Consider using @Nullable and @Nonnull annotations > - > > Key: KAFKA-9234 > URL: https://issues.apache.org/jira/browse/KAFKA-9234 > Project: Kafka > Issue Type: Improvement > Components: admin, clients, consumer, KafkaConnect, producer, > streams, streams-test-utils >Reporter: Matthias J. Sax >Priority: Minor > Labels: beginner, newbie > > Java7 was dropped some time ago, and we might want to consider using Java 8 > `@Nullable` and `@Nonnull` annotations for all public facing APIs instead of > documenting it in JavaDocs only. > This ticket should be broken down into a series of smaller PRs to keep the > scope of each PR contained, allowing for more effective reviews. -- This message was sent by Atlassian Jira (v8.3.4#803005)
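A minimal sketch of what the ticket proposes. The JSR-305 `javax.annotation.Nullable`/`Nonnull` annotations normally come from an external jar, so local stand-in annotations are defined here to keep the example self-contained; `HeaderLookup` is a hypothetical class, not a real Kafka API.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Map;

// Local stand-ins for the JSR-305 annotations (javax.annotation.Nullable /
// Nonnull), which would normally be an external dependency; defined here so
// the sketch compiles on its own.
@Retention(RetentionPolicy.CLASS)
@Target({ElementType.METHOD, ElementType.PARAMETER})
@interface Nullable {}

@Retention(RetentionPolicy.CLASS)
@Target({ElementType.METHOD, ElementType.PARAMETER})
@interface Nonnull {}

// Hypothetical API, not a real Kafka class. The annotations turn a
// JavaDoc-only contract ("may return null when the key is absent") into
// something IDEs and static analyzers can check at every call site.
class HeaderLookup {
    @Nullable
    static String lastHeader(@Nonnull String key, Map<String, String> headers) {
        return headers.get(key); // null when the key is absent
    }
}
```

This is why the ticket suggests many small PRs: each annotated method changes what analyzers flag downstream, so reviews stay tractable when the scope per PR is narrow.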
[jira] [Updated] (KAFKA-9203) kafka-client 2.3.1 fails to consume lz4 compressed topic in kafka 2.1.1
[ https://issues.apache.org/jira/browse/KAFKA-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Watzke updated KAFKA-9203: Description: I run kafka cluster 2.1.1 when I upgraded the consumer app to use kafka-client 2.3.0 (or 2.3.1) instead of 2.2.0, I immediately started getting the following exceptions in a loop when consuming a topic with LZ4-compressed messages: {noformat} 2019-11-18 11:59:16.888 ERROR [afka-consumer-thread] com.avast.utils2.kafka.consumer.ConsumerThread: Unexpected exception occurred while polling and processing messages: org.apache.kafka.common.KafkaExce ption: Received exception when fetching the next record from FILEREP_LONER_REQUEST-4. If needed, please seek past the record to continue consumption. org.apache.kafka.common.KafkaException: Received exception when fetching the next record from FILEREP_LONER_REQUEST-4. If needed, please seek past the record to continue consumption. at org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.fetchRecords(Fetcher.java:1473) at org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.access$1600(Fetcher.java:1332) at org.apache.kafka.clients.consumer.internals.Fetcher.fetchRecords(Fetcher.java:645) at org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:606) at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1263) at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1225) at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1201) at com.avast.utils2.kafka.consumer.ConsumerThread.pollAndProcessMessagesFromKafka(ConsumerThread.java:180) at com.avast.utils2.kafka.consumer.ConsumerThread.run(ConsumerThread.java:149) at com.avast.filerep.saver.RequestSaver$.$anonfun$main$8(RequestSaver.scala:28) at com.avast.filerep.saver.RequestSaver$.$anonfun$main$8$adapted(RequestSaver.scala:19) at 
resource.AbstractManagedResource.$anonfun$acquireFor$1(AbstractManagedResource.scala:88) at scala.util.control.Exception$Catch.$anonfun$either$1(Exception.scala:252) at scala.util.control.Exception$Catch.apply(Exception.scala:228) at scala.util.control.Exception$Catch.either(Exception.scala:252) at resource.AbstractManagedResource.acquireFor(AbstractManagedResource.scala:88) at resource.ManagedResourceOperations.apply(ManagedResourceOperations.scala:26) at resource.ManagedResourceOperations.apply$(ManagedResourceOperations.scala:26) at resource.AbstractManagedResource.apply(AbstractManagedResource.scala:50) at resource.ManagedResourceOperations.acquireAndGet(ManagedResourceOperations.scala:25) at resource.ManagedResourceOperations.acquireAndGet$(ManagedResourceOperations.scala:25) at resource.AbstractManagedResource.acquireAndGet(AbstractManagedResource.scala:50) at resource.ManagedResourceOperations.foreach(ManagedResourceOperations.scala:53) at resource.ManagedResourceOperations.foreach$(ManagedResourceOperations.scala:53) at resource.AbstractManagedResource.foreach(AbstractManagedResource.scala:50) at com.avast.filerep.saver.RequestSaver$.$anonfun$main$6(RequestSaver.scala:19) at com.avast.filerep.saver.RequestSaver$.$anonfun$main$6$adapted(RequestSaver.scala:18) at resource.AbstractManagedResource.$anonfun$acquireFor$1(AbstractManagedResource.scala:88) at scala.util.control.Exception$Catch.$anonfun$either$1(Exception.scala:252) at scala.util.control.Exception$Catch.apply(Exception.scala:228) at scala.util.control.Exception$Catch.either(Exception.scala:252) at resource.AbstractManagedResource.acquireFor(AbstractManagedResource.scala:88) at resource.ManagedResourceOperations.apply(ManagedResourceOperations.scala:26) at resource.ManagedResourceOperations.apply$(ManagedResourceOperations.scala:26) at resource.AbstractManagedResource.apply(AbstractManagedResource.scala:50) at resource.ManagedResourceOperations.acquireAndGet(ManagedResourceOperations.scala:25) at 
resource.ManagedResourceOperations.acquireAndGet$(ManagedResourceOperations.scala:25) at resource.AbstractManagedResource.acquireAndGet(AbstractManagedResource.scala:50) at resource.ManagedResourceOperations.foreach(ManagedResourceOperations.scala:53) at resource.ManagedResourceOperations.foreach$(ManagedResourceOperations.scala:53) at resource.AbstractManagedResource.foreach(AbstractManagedResource.scala:50) at com.avast.filerep.saver.RequestSaver$.$anonfun$main$4(RequestSaver.scala:18) at
[jira] [Commented] (KAFKA-9203) kafka-client 2.3.1 fails to consume lz4 compressed topic in kafka 2.1.1
[ https://issues.apache.org/jira/browse/KAFKA-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986765#comment-16986765 ] David Watzke commented on KAFKA-9203: - [~ijuma] Thank you. With the attached clients jar it works fine! So it really is caused by this commit. > kafka-client 2.3.1 fails to consume lz4 compressed topic in kafka 2.1.1 > --- > > Key: KAFKA-9203 > URL: https://issues.apache.org/jira/browse/KAFKA-9203 > Project: Kafka > Issue Type: Bug > Components: compression, consumer >Affects Versions: 2.3.0, 2.3.1 >Reporter: David Watzke >Priority: Critical > Attachments: kafka-clients-2.3.2-SNAPSHOT.jar > > > I run kafka cluster 2.1.1 > when I upgraded a client app to use kafka-client 2.3.0 (or 2.3.1) instead of > 2.2.0, I immediately started getting the following exceptions in a loop when > consuming a topic with LZ4-compressed messages: > {noformat} > 2019-11-18 11:59:16.888 ERROR [afka-consumer-thread] > com.avast.utils2.kafka.consumer.ConsumerThread: Unexpected exception occurred > while polling and processing messages: org.apache.kafka.common.KafkaExce > ption: Received exception when fetching the next record from > FILEREP_LONER_REQUEST-4. If needed, please seek past the record to continue > consumption. > org.apache.kafka.common.KafkaException: Received exception when fetching the > next record from FILEREP_LONER_REQUEST-4. If needed, please seek past the > record to continue consumption. 
[jira] [Commented] (KAFKA-9071) transactional.id.expiration.ms config value should be implemented as a Long
[ https://issues.apache.org/jira/browse/KAFKA-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986735#comment-16986735 ] Matthias J. Sax commented on KAFKA-9071: [~despondency] We are usually somewhat lazy about updating the tickets, and you can do it yourself – however, there is no "review" stage defined in this Jira – you can hit "submit patch" though to mark it as "PR available". The best way to get attention is to tag potential reviewers (i.e., usually committers) on GitHub in a comment and ask for a review. > transactional.id.expiration.ms config value should be implemented as a Long > --- > > Key: KAFKA-9071 > URL: https://issues.apache.org/jira/browse/KAFKA-9071 > Project: Kafka > Issue Type: Improvement > Components: config >Affects Versions: 2.3.0 >Reporter: Joost van de Wijgerd >Assignee: Mario Georgiev >Priority: Major > > Currently the value of this config parameter is limited to MAX_INT > effectively limiting the transactional id expiration to ~ 25 days. This is > causing some issues for us on our Acceptance environment (which is not used > that often / heavily) where our transactional services will start failing > because of this issue. > I believe best practice for millisecond values should be to implement them as > a Long and not as an Integer > this is currently the max value: transactional.id.expiration.ms=2147483647 > while I would like to set it to: transactional.id.expiration.ms=3154000 > (i.e. 1 year) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-9252) Kafka Connect fails to create connector if single-broker Kafka cluster is configured for offsets.topic.replication.factor=3
[ https://issues.apache.org/jira/browse/KAFKA-9252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robin Moffatt updated KAFKA-9252: - Summary: Kafka Connect fails to create connector if single-broker Kafka cluster is configured for offsets.topic.replication.factor=3 (was: Kafka Connect ) > Kafka Connect fails to create connector if single-broker Kafka cluster is > configured for offsets.topic.replication.factor=3 > --- > > Key: KAFKA-9252 > URL: https://issues.apache.org/jira/browse/KAFKA-9252 > Project: Kafka > Issue Type: Improvement > Components: KafkaConnect >Affects Versions: 2.3.1 >Reporter: Robin Moffatt >Priority: Minor > > If I mis-configure my *single* Kafka broker with > `offsets.topic.replication.factor=3` (the default), Kafka Connect will start > up absolutely fine (Kafka Connect started in the log file, `/connectors` > endpoint returns HTTP 200). But if I try to create a connector, it > (eventually) returns > {code:java} > {"error_code":500,"message":"Request timed out"}{code} > There's no error in the Kafka Connect worker log at INFO level. More details: > [https://rmoff.net/2019/11/29/kafka-connect-request-timed-out/] > This could be improved. Either ensure at startup that the Kafka consumer > offsets topic is available and do not start up if it's not, or at least log why > the connector failed to be created. -- This message was sent by Atlassian Jira (v8.3.4#803005)
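The fail-fast startup check the ticket asks for can be sketched as pure validation logic. This is illustrative only; the names below are invented and this is not Connect's actual code. The underlying fact is simply that a topic with replication factor N cannot be created on a cluster with fewer than N live brokers.

```java
// Minimal sketch of a pre-flight check a Connect worker could run before
// accepting connectors: fail loudly if the configured offsets-topic
// replication factor exceeds the number of available brokers, instead of
// timing out silently later. Illustrative names, not Connect's real code.
class OffsetsTopicPreflight {
    static void checkOffsetsTopicCreatable(int liveBrokers, short replicationFactor) {
        if (replicationFactor > liveBrokers) {
            throw new IllegalStateException(
                "offsets.topic.replication.factor=" + replicationFactor
                + " but only " + liveBrokers + " broker(s) are available; "
                + "the offsets topic cannot be created");
        }
    }
}
```

With the reporter's setup (one broker, replication factor 3) this check would fail at startup with an explicit message rather than a 500 "Request timed out" at connector creation.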
[jira] [Commented] (KAFKA-6718) Rack Aware Stand-by Task Assignment for Kafka Streams
[ https://issues.apache.org/jira/browse/KAFKA-6718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986715#comment-16986715 ] Levani Kokhreidze commented on KAFKA-6718: -- Hi [~vvcephei] Would appreciate your thoughts on this ticket if there is anything specific you think we must take into account while implementing this new feature. KIP-441 has a lot of work with standby tasks, wondering if we should unify this in the KIP-441 (pretty sure we shouldn't, thinking out loud), should we wait for KIP-441, or they can be done in parallel? > Rack Aware Stand-by Task Assignment for Kafka Streams > - > > Key: KAFKA-6718 > URL: https://issues.apache.org/jira/browse/KAFKA-6718 > Project: Kafka > Issue Type: New Feature > Components: streams >Reporter: Deepak Goyal >Priority: Major > Labels: needs-kip > > |Machines in data centre are sometimes grouped in racks. Racks provide > isolation as each rack may be in a different physical location and has its > own power source. When tasks are properly replicated across racks, it > provides fault tolerance in that if a rack goes down, the remaining racks can > continue to serve traffic. > > This feature is already implemented in Kafka > [KIP-36|https://cwiki.apache.org/confluence/display/KAFKA/KIP-36+Rack+aware+replica+assignment] > but we needed similar for task assignments at the Kafka Streams application > layer. > > This feature enables replica tasks to be assigned on different racks for > fault-tolerance. > NUM_STANDBY_REPLICAS = x > totalTasks = x+1 (replica + active) > # If no rackID is provided: Cluster will behave rack-unaware > # If the same rackId is given to all the nodes: Cluster will behave rack-unaware > # If (totalTasks <= number of racks), then Cluster will be rack aware i.e. > each replica task is assigned to a different rack. 
> # If (totalTasks > number of racks), then it will first assign tasks on > different racks; further tasks will be assigned to the least loaded node, > cluster-wide.| > We have added another config in StreamsConfig called "RACK_ID_CONFIG" which > helps StickyPartitionAssignor to assign tasks in such a way that no two > replica tasks are on the same rack if possible. > Post that, it also helps to maintain stickiness within the rack.| -- This message was sent by Atlassian Jira (v8.3.4#803005)
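The assignment rules enumerated in the ticket can be sketched as follows. This is an illustrative simplification, not the actual Streams assignor; all names are invented. Per task, each copy (active plus standbys) goes to a node in a rack not yet used by that task; once every rack holds a copy, placement falls back to the least loaded node cluster-wide.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch of rack-aware standby placement (not the real Streams
// code): prefer nodes in racks that do not yet hold a copy of this task;
// when all racks are used, fall back to the least loaded node.
class RackAwareAssigner {
    // nodeRacks: nodeId -> rackId; load: nodeId -> current task count
    // (shared across calls so the fallback is cluster-wide least-loaded).
    // Returns the chosen nodeId for each copy of one task.
    static List<String> assignCopies(int totalCopies, Map<String, String> nodeRacks,
                                     Map<String, Integer> load) {
        Set<String> usedRacks = new HashSet<>();
        List<String> placement = new ArrayList<>();
        for (int copy = 0; copy < totalCopies; copy++) {
            String best = null;
            for (String node : nodeRacks.keySet()) {
                if (best == null) {
                    best = node;
                    continue;
                }
                boolean nodeRackFree = !usedRacks.contains(nodeRacks.get(node));
                boolean bestRackFree = !usedRacks.contains(nodeRacks.get(best));
                if ((nodeRackFree && !bestRackFree)                       // prefer an unused rack
                        || (nodeRackFree == bestRackFree
                            && load.get(node) < load.get(best))) {        // then least loaded
                    best = node;
                }
            }
            usedRacks.add(nodeRacks.get(best));
            load.merge(best, 1, Integer::sum);
            placement.add(best);
        }
        return placement;
    }
}
```

With three nodes across two racks and totalTasks = 2, the two copies land in different racks; a third copy, having exhausted the racks, goes to the remaining least loaded node, matching rules 3 and 4 of the ticket.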
[jira] [Comment Edited] (KAFKA-6718) Rack Aware Stand-by Task Assignment for Kafka Streams
[ https://issues.apache.org/jira/browse/KAFKA-6718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986715#comment-16986715 ] Levani Kokhreidze edited comment on KAFKA-6718 at 12/3/19 8:36 AM: --- Hi [~vvcephei] Would appreciate your thoughts on this ticket if there is anything specific you think we must take into account while implementing this new feature. KIP-441 has a lot of work with standby tasks, wondering if we should unify this in the KIP-441 (pretty sure we shouldn't, thinking out loud), should we wait for KIP-441, or they can be done in parallel? was (Author: lkokhreidze): Hi [~vvcephei] Would appreciate your thoughts on this ticket if there're anything specific you think we must take into account while implementing this new feature. KIP-441 has a lot of work with standby tasks, wondering if we should unify this in the KIP-441 (pretty sure we shouldn't, thinking out loud), should we wait for KIP-441, or they can be done in parallel? > Rack Aware Stand-by Task Assignment for Kafka Streams > - > > Key: KAFKA-6718 > URL: https://issues.apache.org/jira/browse/KAFKA-6718 > Project: Kafka > Issue Type: New Feature > Components: streams >Reporter: Deepak Goyal >Priority: Major > Labels: needs-kip > > |Machines in data centre are sometimes grouped in racks. Racks provide > isolation as each rack may be in a different physical location and has its > own power source. When tasks are properly replicated across racks, it > provides fault tolerance in that if a rack goes down, the remaining racks can > continue to serve traffic. > > This feature is already implemented at Kafka > [KIP-36|https://cwiki.apache.org/confluence/display/KAFKA/KIP-36+Rack+aware+replica+assignment] > but we needed similar for task assignments at Kafka Streams Application > layer. > > This features enables replica tasks to be assigned on different racks for > fault-tolerance. 
> NUM_STANDBY_REPLICAS = x > totalTasks = x+1 (replica + active) > # If no rackID is provided: Cluster will behave rack-unaware > # If the same rackId is given to all the nodes: Cluster will behave rack-unaware > # If (totalTasks <= number of racks), then Cluster will be rack aware i.e. > each replica task is assigned to a different rack. > # If (totalTasks > number of racks), then it will first assign tasks on > different racks; further tasks will be assigned to the least loaded node, > cluster-wide.| > We have added another config in StreamsConfig called "RACK_ID_CONFIG" which > helps StickyPartitionAssignor to assign tasks in such a way that no two > replica tasks are on the same rack if possible. > Post that, it also helps to maintain stickiness within the rack.| -- This message was sent by Atlassian Jira (v8.3.4#803005)