[jira] [Commented] (KAFKA-8104) Consumer cannot rejoin to the group after rebalancing
[ https://issues.apache.org/jira/browse/KAFKA-8104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16942526#comment-16942526 ] Nikita Koryabkin commented on KAFKA-8104:
-
[~kgn] , [~nizhikov] *Version 1.1.0 is also affected*, thanks.

[2019-10-01 17:40:33,995] INFO [GroupCoordinator 1001]: Member -2af431fd-60e4-4dd7-a4fd-8dd85d4a5620 in group main has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
[2019-10-01 17:40:33,995] INFO [GroupCoordinator 1001]: Preparing to rebalance group main with old generation 15 (__consumer_offsets-1) (kafka.coordinator.group.GroupCoordinator)
[2019-10-01 17:40:33,995] INFO [GroupCoordinator 1001]: Group main with generation 16 is now empty (__consumer_offsets-1) (kafka.coordinator.group.GroupCoordinator)

> Consumer cannot rejoin to the group after rebalancing
> -
>
> Key: KAFKA-8104
> URL: https://issues.apache.org/jira/browse/KAFKA-8104
> Project: Kafka
> Issue Type: Bug
> Components: consumer
> Affects Versions: 2.0.0, 2.1.0, 2.2.0, 2.3.0
> Reporter: Gregory Koshelev
> Assignee: Nikolay Izhikov
> Priority: Critical
> Attachments: consumer-rejoin-fail.log
>
>
> TL;DR: {{KafkaConsumer}} cannot rejoin the group due to an inconsistent
> {{AbstractCoordinator.generation}} (which is {{NO_GENERATION}}) and
> {{AbstractCoordinator.joinFuture}} (which is a succeeded {{RequestFuture}}).
> See the explanation below.
> There are 16 consumers in a single process (threads pool-4-thread-1 to
> pool-4-thread-16). All of them belong to the single consumer group
> {{hercules.sink.elastic.legacy_logs_elk_c2}}.
> Rebalancing has occurred and the consumers got {{CommitFailedException}} as expected:
> {noformat}
> 2019-03-10T03:16:37.023Z [pool-4-thread-10] WARN r.k.vostok.hercules.sink.SimpleSink - Commit failed due to rebalancing
> org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
> 	at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.sendOffsetCommitRequest(ConsumerCoordinator.java:798)
> 	at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:681)
> 	at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1334)
> 	at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1298)
> 	at ru.kontur.vostok.hercules.sink.Sink.commit(Sink.java:156)
> 	at ru.kontur.vostok.hercules.sink.SimpleSink.run(SimpleSink.java:104)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 	at java.lang.Thread.run(Thread.java:748)
> {noformat}
> After that, most of them successfully rejoined the group with generation 10699:
> {noformat}
> 2019-03-10T03:16:39.208Z [pool-4-thread-13] INFO o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-13, groupId=hercules.sink.elastic.legacy_logs_elk_c2] Successfully joined group with generation 10699
> 2019-03-10T03:16:39.209Z [pool-4-thread-13] INFO o.a.k.c.c.i.ConsumerCoordinator - [Consumer clientId=consumer-13, groupId=hercules.sink.elastic.legacy_logs_elk_c2] Setting newly assigned partitions [legacy_logs_elk_c2-18]
> ...
> 2019-03-10T03:16:39.216Z [pool-4-thread-11] INFO o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-11, groupId=hercules.sink.elastic.legacy_logs_elk_c2] Successfully joined group with generation 10699
> 2019-03-10T03:16:39.217Z [pool-4-thread-11] INFO o.a.k.c.c.i.ConsumerCoordinator - [Consumer clientId=consumer-11, groupId=hercules.sink.elastic.legacy_logs_elk_c2] Setting newly assigned partitions [legacy_logs_elk_c2-10, legacy_logs_elk_c2-11]
> ...
> 2019-03-10T03:16:39.218Z [pool-4-thread-15] INFO o.a.k.c.c.i.ConsumerCoordinator - [Consumer clientId=consumer-15, groupId=hercules.sink.elastic.legacy_logs_elk_c2] Setting newly assigned partitions [legacy_logs_elk_c2-24]
> 2019-03-10T03:16:42.320Z [kafka-coordinator-heartbeat-thread | hercules.sink.elastic.legacy_logs_elk_c2] INFO o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-6, groupId=hercules.sink.elastic.legacy_logs_elk_c2] Attempt to h
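The inconsistent state described in this report can be sketched in a few lines of plain Java. This is a hypothetical model, not Kafka's actual `AbstractCoordinator` code: the field names mirror the report, but the logic is simplified to show why a stale succeeded join future suppresses the rejoin even though the generation was reset.

```java
// Hypothetical model of the inconsistent coordinator state from KAFKA-8104;
// names echo AbstractCoordinator but this is NOT Kafka's real implementation.
public class StuckRejoinSketch {
    static final int NO_GENERATION = -1;

    // generation was reset by the failed heartbeat...
    static int generation = NO_GENERATION;
    // ...but the join future still reports a *successful* join.
    static Boolean joinFutureSucceeded = Boolean.TRUE;

    // Simplified rejoin check: a non-null, succeeded join future makes the
    // consumer believe a join is already done, so it never sends JoinGroup.
    static boolean willSendJoinGroup() {
        return generation == NO_GENERATION && joinFutureSucceeded == null;
    }

    public static void main(String[] args) {
        // generation says "not in the group", the stale future says
        // "join already succeeded" -- so no rejoin is ever triggered.
        System.out.println("generation=" + generation
                + " joinSucceeded=" + joinFutureSucceeded
                + " willRejoin=" + willSendJoinGroup());
    }
}
```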
[jira] [Commented] (KAFKA-5609) Connect log4j should log to file by default
[ https://issues.apache.org/jira/browse/KAFKA-5609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16942454#comment-16942454 ] ASF GitHub Bot commented on KAFKA-5609:
---
kkonstantine commented on pull request #7430: KAFKA-5609: Connect log4j should also log to a file by default (KIP-521)
URL: https://github.com/apache/kafka/pull/7430

Enable Kafka Connect to redirect log4j messages to a file by default, in addition to the redirection to standard output. The file-based log4j appender rolls daily and shares the same pattern with the stdout appender.

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Connect log4j should log to file by default
> ---
>
> Key: KAFKA-5609
> URL: https://issues.apache.org/jira/browse/KAFKA-5609
> Project: Kafka
> Issue Type: Improvement
> Components: config, KafkaConnect
> Affects Versions: 0.11.0.0
> Reporter: Yeva Byzek
> Assignee: Kaufman Ng
> Priority: Minor
> Labels: easyfix
>
> {{https://github.com/apache/kafka/blob/trunk/config/connect-log4j.properties}}
> currently logs to stdout. It should also log to a file by default; otherwise
> it just writes to the console and messages can be lost.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
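The change the PR describes amounts to adding a daily-rolling file appender next to the existing console appender. A sketch of what such a `connect-log4j.properties` could look like follows; the appender name, file path, and pattern below are illustrative assumptions, not the exact contents of the merged change.

```properties
# Hypothetical sketch of a KIP-521-style connect-log4j.properties:
# keep the existing stdout appender and add a daily-rolling file appender
# that shares the same conversion pattern.
log4j.rootLogger=INFO, stdout, connectAppender

log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c:%L)%n

# Rolls daily; file location is an assumption for illustration
log4j.appender.connectAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.connectAppender.File=${kafka.logs.dir}/connect.log
log4j.appender.connectAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.connectAppender.layout.ConversionPattern=[%d] %p %m (%c:%L)%n
```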
[jira] [Resolved] (KAFKA-8595) Support SerDe of Decimals in JSON that are not HEX encoded
[ https://issues.apache.org/jira/browse/KAFKA-8595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch resolved KAFKA-8595.
--
Fix Version/s: 2.4.0
Reviewer: Randall Hauch
Resolution: Fixed

KIP-481 was approved, and this PR was merged to the `trunk` branch, which is the branch that 2.4.0 will be based on.

> Support SerDe of Decimals in JSON that are not HEX encoded
> --
>
> Key: KAFKA-8595
> URL: https://issues.apache.org/jira/browse/KAFKA-8595
> Project: Kafka
> Issue Type: Improvement
> Reporter: Almog Gavra
> Assignee: Almog Gavra
> Priority: Major
> Fix For: 2.4.0
>
> Most JSON data that utilizes precise decimal data represents it as a decimal
> string. Kafka Connect, on the other hand, only supports a binary HEX string
> encoding (see the example below). We should support deserialization and
> serialization for any of the following types:
> {code:java}
> {
>   "asHex": "D3J5",
>   "asString": "10.12345",
>   "asNumber": 10.2345
> }{code}
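The binary encoding the issue contrasts with serializes a decimal's *unscaled* value as bytes plus a scale taken from the schema, while the new forms carry the number as a plain string or JSON number. A small self-contained Java sketch (base64 is used here to show the byte form in text; the values are illustrative, and this is not Connect's actual converter code):

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Base64;

// Sketch of the two decimal representations discussed in KAFKA-8595:
// byte-encoded unscaled value + scale vs. a plain decimal string.
public class DecimalEncodings {
    public static void main(String[] args) {
        // String form: the decimal written out directly
        BigDecimal fromString = new BigDecimal("10.12345");

        // Byte form: unscaled value 1012345 with scale 5 (10.12345)
        byte[] unscaled = BigInteger.valueOf(1_012_345L).toByteArray();
        String asBase64 = Base64.getEncoder().encodeToString(unscaled);
        BigDecimal fromBytes = new BigDecimal(new BigInteger(unscaled), 5);

        System.out.println("byte form (base64): " + asBase64);
        // Both representations decode to the same BigDecimal
        System.out.println("equal: " + fromString.equals(fromBytes));
    }
}
```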
[jira] [Commented] (KAFKA-8887) Use purgatory for CreateAcls and DeleteAcls if implementation is async
[ https://issues.apache.org/jira/browse/KAFKA-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16942325#comment-16942325 ] ASF GitHub Bot commented on KAFKA-8887:
---
rajinisivaram commented on pull request #7404: KAFKA-8887; Use purgatory for ACL updates using async authorizers
URL: https://github.com/apache/kafka/pull/7404

> Use purgatory for CreateAcls and DeleteAcls if implementation is async
> --
>
> Key: KAFKA-8887
> URL: https://issues.apache.org/jira/browse/KAFKA-8887
> Project: Kafka
> Issue Type: Sub-task
> Components: core
> Reporter: Rajini Sivaram
> Assignee: Rajini Sivaram
> Priority: Major
> Fix For: 2.4.0
>
> KAFKA-8886 is updating the Authorizer.createAcls and Authorizer.deleteAcls
> APIs to be asynchronous to avoid blocking request threads during ACL updates
> when implementations use external stores like databases, where updates may
> block for a long time. This Jira is to handle those async updates using a
> purgatory in KafkaApis.
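The pattern the issue describes can be sketched with `CompletableFuture`: the request thread registers a completion callback (purgatory-style) instead of blocking until the store update finishes. This is a simplified stand-in, not Kafka's real `Authorizer` or `KafkaApis` code; the interface and method names below are illustrative.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch of async ACL creation: the handler returns immediately
// and the response is produced only when the (possibly slow, DB-backed)
// authorizer update completes. Not Kafka's actual API.
public class AsyncAclSketch {
    interface Authorizer {
        CompletableFuture<String> createAcls(List<String> bindings);
    }

    static void handleCreateAcls(Authorizer authorizer, List<String> bindings) {
        authorizer.createAcls(bindings)
                // Purgatory-style: completion is driven by the future,
                // not by the request thread waiting on it.
                .whenComplete((result, error) ->
                        System.out.println(error == null
                                ? "response: " + result
                                : "response: error " + error.getMessage()));
    }

    public static void main(String[] args) {
        // Toy authorizer that completes immediately
        handleCreateAcls(
                b -> CompletableFuture.completedFuture("created " + b.size() + " acls"),
                List.of("User:alice ALLOW Read topic-a"));
    }
}
```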
[jira] [Commented] (KAFKA-8595) Support SerDe of Decimals in JSON that are not HEX encoded
[ https://issues.apache.org/jira/browse/KAFKA-8595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16942296#comment-16942296 ] ASF GitHub Bot commented on KAFKA-8595:
---
rhauch commented on pull request #7354: KAFKA-8595: Support deserialization of JSON decimals encoded in NUMERIC
URL: https://github.com/apache/kafka/pull/7354

> Support SerDe of Decimals in JSON that are not HEX encoded
> --
>
> Key: KAFKA-8595
> URL: https://issues.apache.org/jira/browse/KAFKA-8595
> Project: Kafka
> Issue Type: Improvement
> Reporter: Almog Gavra
> Assignee: Almog Gavra
> Priority: Major
>
> Most JSON data that utilizes precise decimal data represents it as a decimal
> string. Kafka Connect, on the other hand, only supports a binary HEX string
> encoding (see the example below). We should support deserialization and
> serialization for any of the following types:
> {code:java}
> {
>   "asHex": "D3J5",
>   "asString": "10.12345",
>   "asNumber": 10.2345
> }{code}
[jira] [Updated] (KAFKA-8966) Stream state does not transition to RUNNING on client, broker consumer group shows RUNNING
[ https://issues.apache.org/jira/browse/KAFKA-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raman Gupta updated KAFKA-8966:
---
Description:

I have a Kafka stream that has been running fine until recently. The new behavior I see is that the stream state on the client goes from CREATED to REBALANCING, but never transitions from REBALANCING to RUNNING.

However, at the same time, if I look at the offsets of the corresponding consumer group, the consumer group appears to be consuming from the topic and has no lag. And yet, the client never made a state change to RUNNING. This is confirmed by calling `streams.close` on the stream and noting the state change goes from REBALANCING to PENDING_SHUTDOWN instead of RUNNING to PENDING_SHUTDOWN as expected.

I use the state change to enable queries on the stream store -- if the state change listener never triggers to the RUNNING state, there is no way to know when the client is available for queries.

Yes, I have confirmed it's the correct consumer group. Yes, the consumer group has no consumers when I shut down the client stream.

Server logs:

kafka-2 kafka 2019-10-01T16:59:36.348859731Z [2019-10-01 16:59:36,348] INFO [GroupCoordinator 2]: Preparing to rebalance group arena-rg-uiService-fileStatusStore-stream in state PreparingRebalance with old generation 0 (__consumer_offsets-42) (reason: Adding new member arena-rg-uiService-fileStatusStore-stream-0a954f60-f8a3-4f13-8d9e-6caa63773dd2-StreamThread-1-consumer-325a6889-659f-48cb-b308-0d626b573944 with group instanceid None) (kafka.coordinator.group.GroupCoordinator)
kafka-2 kafka 2019-10-01T17:00:06.349171842Z [2019-10-01 17:00:06,348] INFO [GroupCoordinator 2]: Stabilized group arena-rg-uiService-fileStatusStore-stream generation 1 (__consumer_offsets-42) (kafka.coordinator.group.GroupCoordinator)
kafka-2 kafka 2019-10-01T17:00:06.604980028Z [2019-10-01 17:00:06,604] INFO [GroupCoordinator 2]: Assignment received from leader for group arena-rg-uiService-fileStatusStore-stream for generation 1 (kafka.coordinator.group.GroupCoordinator)

was: (the same description, without the server logs)
> Stream state does not transition to RUNNING on client, broker consumer group
> shows RUNNING
> --
>
> Key: KAFKA-8966
> URL: https://issues.apache.org/jira/browse/KAFKA-8966
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 2.3.0
> Reporter: Raman Gupta
> Priority: Critical
>
> I have a Kafka stream that has been running fine until recently. The new
> behavior I see is that the stream state on the client goes from CREATED to
> REBALANCING, but never transitions from REBALANCING to RUNNING.
> However, at the same time, if I look at the offsets of the corresponding
> consumer group, the consumer group appears to be consuming from the topic and
> has no lag. And yet, the client never made a state change to RUNNING. This is
> confirmed by calling `streams.close` on the stream and noting the state
> change goes from REBALANCING to PENDING_SHUTDOWN instead of RUNNING to
> PENDING_SHUTDOWN as expected.
> I use the state change to enable queries on the stream store -- if the state
> change listener never triggers to the RUNNING state, there is no way to know
> when the client is available for queries.
> Yes, I have confirmed it's the correct consumer group. Yes, the consumer group
> has no consumers when I shut down the client stream.
> Server logs:
> kafka-2 kafka 2019-10-01T16:59:36.348859731Z [2019-10-01 16:59:36,348] INFO
> [GroupCoordinator 2]: Preparing to rebalance group
> arena-rg-uiService-fileStatusStore-stream in state PreparingRebalance with
> old generation 0 (__consumer_offsets-42) (reason: Adding new member
> arena-rg-uiService-fileStatusStore-stream-0a954f60-f8a3-4f13-8d9e-6caa63773dd2-StreamThread-1-consumer-325a6889-659f-48cb-
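The reporter's pattern of gating store queries on the RUNNING transition can be sketched in plain Java. In real code this is `KafkaStreams.setStateListener`; here the enum, listener, and driver are a self-contained stand-in for illustration only.

```java
import java.util.concurrent.CountDownLatch;
import java.util.function.BiConsumer;

// Self-contained model of a state listener that unblocks store queries once
// the state machine reaches RUNNING. A simplified stand-in, not the real
// KafkaStreams API.
public class StateListenerSketch {
    enum State { CREATED, REBALANCING, RUNNING, PENDING_SHUTDOWN }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch ready = new CountDownLatch(1);
        // (newState, oldState) callback, like KafkaStreams.StateListener
        BiConsumer<State, State> listener = (newState, oldState) -> {
            if (newState == State.RUNNING) ready.countDown(); // enable queries
        };

        // Expected transition sequence; in the reported bug the RUNNING
        // transition never fires, so this latch would never open.
        listener.accept(State.REBALANCING, State.CREATED);
        listener.accept(State.RUNNING, State.REBALANCING);

        ready.await();
        System.out.println("store is queryable");
    }
}
```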
[jira] [Created] (KAFKA-8967) Flaky test kafka.api.SaslSslAdminClientIntegrationTest.testCreateTopicsResponseMetadataAndConfig
Stanislav Kozlovski created KAFKA-8967:
--
Summary: Flaky test kafka.api.SaslSslAdminClientIntegrationTest.testCreateTopicsResponseMetadataAndConfig
Key: KAFKA-8967
URL: https://issues.apache.org/jira/browse/KAFKA-8967
Project: Kafka
Issue Type: Improvement
Reporter: Stanislav Kozlovski

{code:java}
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition.
	at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
	at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
	at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
	at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
	at kafka.api.SaslSslAdminClientIntegrationTest.testCreateTopicsResponseMetadataAndConfig(SaslSslAdminClientIntegrationTest.scala:452)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition.{code}

Failed in [https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/25374]
[jira] [Created] (KAFKA-8966) Stream state does not transition to RUNNING on client, broker consumer group shows RUNNING
Raman Gupta created KAFKA-8966:
--
Summary: Stream state does not transition to RUNNING on client, broker consumer group shows RUNNING
Key: KAFKA-8966
URL: https://issues.apache.org/jira/browse/KAFKA-8966
Project: Kafka
Issue Type: Bug
Components: streams
Affects Versions: 2.3.0
Reporter: Raman Gupta

I have a Kafka stream that has been running fine until recently. The new behavior I see is that the stream state on the client goes from CREATED to REBALANCING, but never transitions from REBALANCING to RUNNING.

However, at the same time, if I look at the offsets of the corresponding consumer group, the consumer group appears to be consuming from the topic and has no lag. And yet, the client never made a state change to RUNNING. This is confirmed by calling `streams.close` on the stream and noting the state change goes from REBALANCING to PENDING_SHUTDOWN instead of RUNNING to PENDING_SHUTDOWN as expected.

I use the state change to enable queries on the stream store -- if the state change listener never triggers to the RUNNING state, there is no way to know when the client is available for queries.

Yes, I have confirmed it's the correct consumer group. Yes, the consumer group has no consumers when I shut down the client stream.
[jira] [Commented] (KAFKA-8377) KTable#transformValue might lead to incorrect result in joins
[ https://issues.apache.org/jira/browse/KAFKA-8377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16942039#comment-16942039 ] Aishwarya Pradeep Kumar commented on KAFKA-8377:
[~mjsax] thank you for the input. I'm testing the current behaviour; based on this I should be able to fix this bug.

> KTable#transformValue might lead to incorrect result in joins
> -
>
> Key: KAFKA-8377
> URL: https://issues.apache.org/jira/browse/KAFKA-8377
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 2.0.0
> Reporter: Matthias J. Sax
> Assignee: Aishwarya Pradeep Kumar
> Priority: Major
> Labels: newbie++
>
> Kafka Streams uses an optimization to not materialize every result KTable. If
> a non-materialized KTable is input to a join, the lookup into the table
> results in a lookup of the parent table plus a call to the operator. For
> example,
> {code:java}
> KTable nonMaterialized = materializedTable.filter(...);
> KTable table2 = ...
> table2.join(nonMaterialized, ...){code}
> If there is a table2 input record, the lookup to the other side is performed
> as a lookup into materializedTable plus applying the filter().
> For a stateless operation like filter, this is safe. However,
> #transformValues() might have an attached state store. Hence, when an input
> record r is processed by #transformValues() with current state S, it might
> produce an output record r' (that is not materialized). When the join later
> does a lookup to get r from the parent table, there is no guarantee that
> #transformValues() again produces r' because its state might not be the same
> any longer.
> Hence, it seems to be required to always materialize the result of a
> KTable#transformValues() operation if there is state. Note that if there
> were a consecutive filter() after transformValues(), it would also be ok to
> materialize the filter() result. Furthermore, if there is no downstream
> join(), materialization is also not required.
>
> Basically, it seems to be unsafe to apply `KTableValueGetter` on a stateful
> `#transformValues()` operator.
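The core hazard, re-applying a stateful transformer to the same input and getting a different output, is easy to demonstrate in isolation. The snippet below is a plain-Java stand-in, not the Streams API: the anonymous function plays the role of a stateful `#transformValues()`, and the second call plays the role of the join's lookup through the parent table.

```java
import java.util.function.Function;

// Minimal illustration of why a value getter is unsafe on a stateful
// transformValues(): re-applying the transformer to the same input record
// need not reproduce the earlier output once its state has moved on.
public class StatefulTransformSketch {
    public static void main(String[] args) {
        // Stateful "transformer": appends a call counter to each value
        Function<String, String> transform = new Function<String, String>() {
            int calls = 0; // the attached "state store"
            @Override
            public String apply(String v) { return v + "#" + (++calls); }
        };

        String first = transform.apply("r");   // what the join *should* see (r')
        String replay = transform.apply("r");  // lookup through the parent table

        System.out.println(first + " vs " + replay
                + " -> equal=" + first.equals(replay));
    }
}
```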
[jira] [Commented] (KAFKA-8959) Update Guava to 24 (or newer) or remove dependency
[ https://issues.apache.org/jira/browse/KAFKA-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941866#comment-16941866 ] Sven Lange commented on KAFKA-8959:
---
It really looks unmaintained. There are so many open pull requests. There is even a very old pull request from May 2018 containing an upgrade to Guava 25.
[https://github.com/ronmamo/reflections/pulls?utf8=%E2%9C%93&q=is%3Aopen+is%3Apr++guava]
I sent the author a message on Twitter: [https://twitter.com/svenlange/status/1179023047203926016]
The author's email address is in the pom.xml.

> Update Guava to 24 (or newer) or remove dependency
> --
>
> Key: KAFKA-8959
> URL: https://issues.apache.org/jira/browse/KAFKA-8959
> Project: Kafka
> Issue Type: Improvement
> Components: KafkaConnect
> Reporter: Ismael Juma
> Priority: Critical
> Fix For: 2.4.0
>
> The reflections library has a dependency on Guava 20.0 and it seems like
> there are some issues when running with newer Guava versions:
> [https://github.com/ronmamo/reflections/issues/194]
> The reflections library hasn't received any code updates since 2017, so it
> looks like it's not actively maintained anymore.
[jira] [Commented] (KAFKA-8059) Flaky Test DynamicConnectionQuotaTest #testDynamicConnectionQuota
[ https://issues.apache.org/jira/browse/KAFKA-8059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941864#comment-16941864 ] Bruno Cadonna commented on KAFKA-8059:
--
https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/25362/

> Flaky Test DynamicConnectionQuotaTest #testDynamicConnectionQuota
> -
>
> Key: KAFKA-8059
> URL: https://issues.apache.org/jira/browse/KAFKA-8059
> Project: Kafka
> Issue Type: Bug
> Components: core, unit tests
> Affects Versions: 2.2.0, 2.1.1
> Reporter: Matthias J. Sax
> Priority: Critical
> Labels: flaky-test
> Fix For: 2.4.0
>
> [https://builds.apache.org/blue/organizations/jenkins/kafka-2.2-jdk8/detail/kafka-2.2-jdk8/46/tests]
> {quote}org.scalatest.junit.JUnitTestFailedError: Expected exception
> java.io.IOException to be thrown, but no exception was thrown
> at org.scalatest.junit.AssertionsForJUnit$class.newAssertionFailedException(AssertionsForJUnit.scala:100)
> at org.scalatest.junit.JUnitSuite.newAssertionFailedException(JUnitSuite.scala:71)
> at org.scalatest.Assertions$class.intercept(Assertions.scala:822)
> at org.scalatest.junit.JUnitSuite.intercept(JUnitSuite.scala:71)
> at kafka.network.DynamicConnectionQuotaTest.testDynamicConnectionQuota(DynamicConnectionQuotaTest.scala:82){quote}
[jira] [Commented] (KAFKA-8964) Refactor Stream-Thread-level Metrics
[ https://issues.apache.org/jira/browse/KAFKA-8964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941858#comment-16941858 ] ASF GitHub Bot commented on KAFKA-8964:
---
cadonna commented on pull request #7429: KAFKA-8964: Rename tag client-id to thread-id for thread-level metric…
URL: https://github.com/apache/kafka/pull/7429

…s and below
- Renamed tag client-id to thread-id for thread-level metrics and below
- Corrected metrics tag keys for state stores that had suffix "-id" instead of "state-id"

> Refactor Stream-Thread-level Metrics
> -
>
> Key: KAFKA-8964
> URL: https://issues.apache.org/jira/browse/KAFKA-8964
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Reporter: Bruno Cadonna
> Assignee: Bruno Cadonna
> Priority: Major
>
> Refactor Stream-Thread-level metrics as specified in KIP-444.
[jira] [Commented] (KAFKA-8807) Flaky Test GlobalThreadShutDownOrderTest.shouldFinishGlobalStoreOperationOnShutDown
[ https://issues.apache.org/jira/browse/KAFKA-8807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941849#comment-16941849 ] ASF GitHub Bot commented on KAFKA-8807:
---
bbejeck commented on pull request #7418: KAFKA-8807: Flaky GlobalStreamThread test
URL: https://github.com/apache/kafka/pull/7418

> Flaky Test
> GlobalThreadShutDownOrderTest.shouldFinishGlobalStoreOperationOnShutDown
> ---
>
> Key: KAFKA-8807
> URL: https://issues.apache.org/jira/browse/KAFKA-8807
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Reporter: Sophie Blee-Goldman
> Assignee: Bill Bejeck
> Priority: Major
> Labels: flaky-test
>
> [https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/24229/testReport/junit/org.apache.kafka.streams.integration/GlobalThreadShutDownOrderTest/shouldFinishGlobalStoreOperationOnShutDown/]
>
> h3. Error Message
> java.lang.AssertionError: expected:<[1, 2, 3, 4]> but was:<[1, 2, 3, 4, 1, 2,
> 3, 4]>
> h3.
Stacktrace > java.lang.AssertionError: expected:<[1, 2, 3, 4]> but was:<[1, 2, 3, 4, 1, 2, > 3, 4]> at org.junit.Assert.fail(Assert.java:89) at > org.junit.Assert.failNotEquals(Assert.java:835) at > org.junit.Assert.assertEquals(Assert.java:120) at > org.junit.Assert.assertEquals(Assert.java:146) at > org.apache.kafka.streams.integration.GlobalThreadShutDownOrderTest.shouldFinishGlobalStoreOperationOnShutDown(GlobalThreadShutDownOrderTest.java:138) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305) at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:365) at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:330) at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:78) at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:328) at > org.junit.runners.ParentRunner.access$100(ParentRunner.java:65) at > 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:292) at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) at > org.junit.rules.RunRules.evaluate(RunRules.java:20) at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305) at > org.junit.runners.ParentRunner.run(ParentRunner.java:412) at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110) > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58) > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38) > at > org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62) > at > org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) at > org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35) > at > org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24) > at > org.gradle.internal.di
[jira] [Commented] (KAFKA-8896) NoSuchElementException after coordinator move
[ https://issues.apache.org/jira/browse/KAFKA-8896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941822#comment-16941822 ]

ASF GitHub Bot commented on KAFKA-8896:
---------------------------------------

mumrah commented on pull request #7377: KAFKA-8896: Check group state before completing delayed heartbeat
URL: https://github.com/apache/kafka/pull/7377

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> NoSuchElementException after coordinator move
> ---------------------------------------------
>
>                 Key: KAFKA-8896
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8896
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 2.2.0, 2.3.0, 2.2.1
>            Reporter: Jason Gustafson
>            Assignee: Boyang Chen
>            Priority: Major
>             Fix For: 2.3.1
>
>
> Caught this exception in the wild:
> {code:java}
> java.util.NoSuchElementException: key not found: consumer-group-38981ebe-4361-44e7-b710-7d11f5d35639
> 	at scala.collection.MapLike.default(MapLike.scala:235)
> 	at scala.collection.MapLike.default$(MapLike.scala:234)
> 	at scala.collection.AbstractMap.default(Map.scala:63)
> 	at scala.collection.mutable.HashMap.apply(HashMap.scala:69)
> 	at kafka.coordinator.group.GroupMetadata.get(GroupMetadata.scala:214)
> 	at kafka.coordinator.group.GroupCoordinator.$anonfun$tryCompleteHeartbeat$1(GroupCoordinator.scala:1008)
> 	at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
> 	at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
> 	at kafka.coordinator.group.GroupMetadata.inLock(GroupMetadata.scala:209)
> 	at kafka.coordinator.group.GroupCoordinator.tryCompleteHeartbeat(GroupCoordinator.scala:1001)
> 	at kafka.coordinator.group.DelayedHeartbeat.tryComplete(DelayedHeartbeat.scala:34)
> 	at kafka.server.DelayedOperation.maybeTryComplete(DelayedOperation.scala:122)
> 	at kafka.server.DelayedOperationPurgatory$Watchers.tryCompleteWatched(DelayedOperation.scala:391)
> 	at kafka.server.DelayedOperationPurgatory.checkAndComplete(DelayedOperation.scala:295)
> 	at kafka.coordinator.group.GroupCoordinator.completeAndScheduleNextExpiration(GroupCoordinator.scala:802)
> 	at kafka.coordinator.group.GroupCoordinator.completeAndScheduleNextHeartbeatExpiration(GroupCoordinator.scala:795)
> 	at kafka.coordinator.group.GroupCoordinator.$anonfun$handleHeartbeat$2(GroupCoordinator.scala:543)
> 	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
> 	at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
> 	at kafka.coordinator.group.GroupMetadata.inLock(GroupMetadata.scala:209)
> 	at kafka.coordinator.group.GroupCoordinator.handleHeartbeat(GroupCoordinator.scala:516)
> 	at kafka.server.KafkaApis.handleHeartbeatRequest(KafkaApis.scala:1617)
> 	at kafka.server.KafkaApis.handle(KafkaApis.scala:155)
> {code}
>
> Looking at the logs, I see a coordinator change just prior to this exception. The group was first unloaded as the coordinator moved to another broker and then was loaded again as the coordinator was moved back. I am guessing that somehow the delayed heartbeat is retaining the reference to the old GroupMetadata instance. Not sure exactly how this can happen though.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
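The stale-reference hazard the reporter guesses at can be sketched in plain Java. This is an illustrative stand-in, not Kafka's actual coordinator code: the `Group` class, `registry` map, and `simulateCoordinatorMove` method are hypothetical. The point is that a delayed operation which captures an object reference at scheduling time keeps seeing the old copy after the group is unloaded and a fresh instance is reloaded, so its view diverges from the live registry.

```java
import java.util.HashMap;
import java.util.Map;

public class StaleReferenceSketch {
    // Illustrative stand-in for GroupMetadata: just a member map.
    static class Group {
        final Map<String, String> members = new HashMap<>();
    }

    static final Map<String, Group> registry = new HashMap<>();

    // Simulates a coordinator move: unload the group, reload a fresh
    // instance, then consult the Group that the "delayed heartbeat"
    // captured beforehand. Returns "<staleHasMember> <currentHasMember>".
    static String simulateCoordinatorMove() {
        Group g = new Group();
        g.members.put("member-1", "session");
        registry.put("my-group", g);

        // The delayed operation captures the instance at scheduling time.
        Group captured = registry.get("my-group");

        // Coordinator moves away (group unloaded)...
        registry.remove("my-group");
        // ...and moves back (a fresh, empty instance is loaded).
        registry.put("my-group", new Group());

        // On expiration the delayed operation checks the *captured*
        // instance, not the live one. A lookup that assumes presence
        // (like Scala's Map.apply) throws NoSuchElementException against
        // whichever copy lacks the member; here the two simply diverge.
        boolean staleHasMember = captured.members.containsKey("member-1");
        boolean currentHasMember =
                registry.get("my-group").members.containsKey("member-1");
        return staleHasMember + " " + currentHasMember;
    }

    public static void main(String[] args) {
        System.out.println(simulateCoordinatorMove());
    }
}
```

Which copy ends up missing the member depends on the real interleaving; the linked PR's approach (checking group state before completing the delayed heartbeat) avoids acting on the stale view at all.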
[jira] [Updated] (KAFKA-8965) the recording level of record-lateness-[avg|max] is wrong
[ https://issues.apache.org/jira/browse/KAFKA-8965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junze Bao updated KAFKA-8965:
-----------------------------
    Description: The documentation says the metric is at INFO level, but it is actually at DEBUG level in the code.  (was: I'm running KafkaStreams 2.3.0 and trying to get the metric record-lateness-[avg | max], but it's always NaN.)
        Summary: the recording level of record-lateness-[avg|max] is wrong  (was: record-lateness-[avg|max] is always NaN)

> the recording level of record-lateness-[avg|max] is wrong
> ---------------------------------------------------------
>
>                 Key: KAFKA-8965
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8965
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 2.3.0
>            Reporter: Junze Bao
>            Priority: Major
>
> The documentation says the metric is at INFO level, but it is actually at DEBUG level in the code.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
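If the sensor is in fact registered at DEBUG level, the metric only reports values once the recording level is raised. A minimal sketch of the relevant setting: the raw config key shown here is the string behind `StreamsConfig.METRICS_RECORDING_LEVEL_CONFIG`; plain JDK `Properties` is used so the snippet has no Kafka dependency.

```java
import java.util.Properties;

public class RecordingLevelExample {
    // Builds a Streams-style config with DEBUG metric recording enabled.
    // In a real application you would pass these Properties to the
    // KafkaStreams constructor alongside the usual bootstrap settings.
    static Properties debugMetricsConfig() {
        Properties props = new Properties();
        // Same key as StreamsConfig.METRICS_RECORDING_LEVEL_CONFIG;
        // valid values are "INFO" and "DEBUG".
        props.setProperty("metrics.recording.level", "DEBUG");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(
                debugMetricsConfig().getProperty("metrics.recording.level"));
    }
}
```

With the level left at its default of INFO, a DEBUG-only sensor records nothing, which would explain the NaN readings in the original report.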
[jira] [Assigned] (KAFKA-8104) Consumer cannot rejoin to the group after rebalancing
[ https://issues.apache.org/jira/browse/KAFKA-8104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nikolay Izhikov reassigned KAFKA-8104:
--------------------------------------

    Assignee: Nikolay Izhikov

> Consumer cannot rejoin to the group after rebalancing
> -----------------------------------------------------
>
>                 Key: KAFKA-8104
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8104
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 2.0.0, 2.1.0, 2.2.0, 2.3.0
>            Reporter: Gregory Koshelev
>            Assignee: Nikolay Izhikov
>            Priority: Critical
>         Attachments: consumer-rejoin-fail.log
>
>
> TL;DR: {{KafkaConsumer}} cannot rejoin the group due to an inconsistency between {{AbstractCoordinator.generation}} (which is {{NO_GENERATION}}) and {{AbstractCoordinator.joinFuture}} (which is a succeeded {{RequestFuture}}). See the explanation below.
> There are 16 consumers in a single process (threads pool-4-thread-1 through pool-4-thread-16). All of them belong to the single consumer group {{hercules.sink.elastic.legacy_logs_elk_c2}}. A rebalance occurred and the consumers got {{CommitFailedException}} as expected:
> {noformat}
> 2019-03-10T03:16:37.023Z [pool-4-thread-10] WARN r.k.vostok.hercules.sink.SimpleSink - Commit failed due to rebalancing
> org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
> 	at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.sendOffsetCommitRequest(ConsumerCoordinator.java:798)
> 	at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:681)
> 	at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1334)
> 	at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1298)
> 	at ru.kontur.vostok.hercules.sink.Sink.commit(Sink.java:156)
> 	at ru.kontur.vostok.hercules.sink.SimpleSink.run(SimpleSink.java:104)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 	at java.lang.Thread.run(Thread.java:748)
> {noformat}
> After that, most of them successfully rejoined the group with generation 10699:
> {noformat}
> 2019-03-10T03:16:39.208Z [pool-4-thread-13] INFO o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-13, groupId=hercules.sink.elastic.legacy_logs_elk_c2] Successfully joined group with generation 10699
> 2019-03-10T03:16:39.209Z [pool-4-thread-13] INFO o.a.k.c.c.i.ConsumerCoordinator - [Consumer clientId=consumer-13, groupId=hercules.sink.elastic.legacy_logs_elk_c2] Setting newly assigned partitions [legacy_logs_elk_c2-18]
> ...
> 2019-03-10T03:16:39.216Z [pool-4-thread-11] INFO o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-11, groupId=hercules.sink.elastic.legacy_logs_elk_c2] Successfully joined group with generation 10699
> 2019-03-10T03:16:39.217Z [pool-4-thread-11] INFO o.a.k.c.c.i.ConsumerCoordinator - [Consumer clientId=consumer-11, groupId=hercules.sink.elastic.legacy_logs_elk_c2] Setting newly assigned partitions [legacy_logs_elk_c2-10, legacy_logs_elk_c2-11]
> ...
> 2019-03-10T03:16:39.218Z [pool-4-thread-15] INFO o.a.k.c.c.i.ConsumerCoordinator - [Consumer clientId=consumer-15, groupId=hercules.sink.elastic.legacy_logs_elk_c2] Setting newly assigned partitions [legacy_logs_elk_c2-24]
> 2019-03-10T03:16:42.320Z [kafka-coordinator-heartbeat-thread | hercules.sink.elastic.legacy_logs_elk_c2] INFO o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-6, groupId=hercules.sink.elastic.legacy_logs_elk_c2] Attempt to heartbeat failed since group is rebalancing
> 2019-03-10T03:16:42.320Z [kafka-coordinator-heartbeat-thread | hercules.sink.elastic.legacy_logs_elk_c2] INFO o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-5, groupId=hercules.sink.elastic.legacy_logs_elk_c2] Attempt to heartbeat failed since group is rebalancing
> 2019-03-10T03:16:42.323Z [kafka-coordinator-heartbeat-thread | hercules.sink.elastic.legacy_logs_elk_c2] INFO o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-7, groupId=hercules.sink.elastic.legacy_logs_elk_c2] Attempt to heartbeat failed since group is rebalancing
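The inconsistent state the TL;DR describes can be sketched with plain-Java stand-ins. `JoinStateSketch`, `willSendJoinRequest`, and the simulated interleaving are illustrative, not the real `AbstractCoordinator` logic: once `generation` has been reset to `NO_GENERATION` while an already-completed `joinFuture` lingers, a rejoin check that short-circuits on the presence of the future leaves the consumer permanently out of the group.

```java
import java.util.concurrent.CompletableFuture;

// Minimal stand-in for the two pieces of join state named in the report:
// AbstractCoordinator.generation and AbstractCoordinator.joinFuture.
public class JoinStateSketch {
    static final int NO_GENERATION = -1;

    int generation = 10699;  // joined at generation 10699
    CompletableFuture<Void> joinFuture =
            CompletableFuture.completedFuture(null);

    // Mirrors the short-circuit: no new JoinGroup request is sent while a
    // join future is present (whether in flight or already succeeded).
    boolean willSendJoinRequest() {
        return joinFuture == null;
    }

    // Simulates the buggy interleaving: the generation is reset, but the
    // completed joinFuture is left behind. Returns
    // "<generation> <willSendJoinRequest>".
    static String simulateInconsistentState() {
        JoinStateSketch c = new JoinStateSketch();
        c.generation = NO_GENERATION;
        return c.generation + " " + c.willSendJoinRequest();
    }

    public static void main(String[] args) {
        // The consumer holds no valid generation, yet never rejoins.
        System.out.println(simulateInconsistentState());
    }
}
```

In this toy model the fix would be to clear both fields together under the same lock, so no observer can see a reset generation alongside a stale succeeded future.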
[jira] [Commented] (KAFKA-8328) Kafka smooth expansion
[ https://issues.apache.org/jira/browse/KAFKA-8328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941695#comment-16941695 ]

Jeffrey(Xilang) Yan commented on KAFKA-8328:
--------------------------------------------

Hi [~LordChen], do you have code that can work on the master branch?

> Kafka smooth expansion
> ----------------------
>
>                 Key: KAFKA-8328
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8328
>             Project: Kafka
>          Issue Type: Improvement
>          Components: clients, core
>    Affects Versions: 0.10.2.0
>            Reporter: ChenLin
>            Priority: Major
>              Labels: Kafka, expansion
>             Fix For: 0.10.2.0
>
>         Attachments: DiskUtil.png, Kafka_smooth_expansion.patch, producerP999.png
>
>
> When expanding the Kafka cluster, a new follower reads data from the earliest offset. This can pull a large amount of historical data from disk, putting heavy pressure on the disk and degrading the performance of the Kafka service; for example, producer write latency increases. In general, Kafka's original expansion mechanism has the following problems:
> 1. The new follower puts a lot of pressure on the disk;
> 2. Producer write latency increases;
> 3. Consumer read latency increases.
> To solve these problems, we propose a scheme for smooth expansion. The main idea is that a newly added follower reads data starting from the HW (high watermark) position, and once it has been reading for a certain time threshold or has read a certain data-size threshold, it enters the ISR. Since the new follower reads from the HW position, most of the data it reads is in the operating system's page cache, so it puts no pressure on the disk and does not affect the performance of the Kafka service, solving the problems above.
> To illustrate the problems with the original expansion scheme, we ran some tests; the corresponding charts are attached.
> !producerP999.png!
> !DiskUtil.png!
> -- This message was sent by Atlassian Jira (v8.3.4#803005)
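The core offset choice in the proposal might be sketched as follows. `initialFetchOffset` is a hypothetical helper, not Kafka code, and it deliberately ignores the separate question of when the new replica may enter the ISR (the proposal gates that on a time or data-size threshold).

```java
public class FetchStartOffsetSketch {
    // Hypothetical helper illustrating the proposal: a brand-new replica
    // starts fetching from the leader's high watermark instead of the log
    // start offset, so its reads mostly hit the page cache rather than
    // cold disk. An existing replica keeps the normal behavior.
    static long initialFetchOffset(boolean isNewReplica,
                                   long logStartOffset,
                                   long highWatermark) {
        return isNewReplica ? highWatermark : logStartOffset;
    }

    public static void main(String[] args) {
        // New replica joining during expansion: fetch from the HW.
        System.out.println(initialFetchOffset(true, 0L, 1_000_000L));
        // Existing replica bootstrapping normally: fetch from log start.
        System.out.println(initialFetchOffset(false, 0L, 1_000_000L));
    }
}
```

The trade-off, which the discussion on the ticket reflects, is that such a replica is not a full copy of the partition until old segments age out, so ISR admission and unclean-leader considerations need separate handling.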
[jira] [Created] (KAFKA-8965) record-lateness-[avg|max] is always NaN
Junze Bao created KAFKA-8965:
--------------------------------

             Summary: record-lateness-[avg|max] is always NaN
                 Key: KAFKA-8965
                 URL: https://issues.apache.org/jira/browse/KAFKA-8965
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 2.3.0
            Reporter: Junze Bao


I'm running KafkaStreams 2.3.0 and trying to get the metric record-lateness-[avg | max], but it's always NaN.

-- This message was sent by Atlassian Jira (v8.3.4#803005)