[jira] [Resolved] (KAFKA-8252) topic catalog.offer-updated.1 has an unexpected message
[ https://issues.apache.org/jira/browse/KAFKA-8252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zan fu resolved KAFKA-8252. --- Resolution: Not A Bug > topic catalog.offer-updated.1 has an unexpected message > --- > > Key: KAFKA-8252 > URL: https://issues.apache.org/jira/browse/KAFKA-8252 > Project: Kafka > Issue Type: Bug >Reporter: zan fu >Priority: Major > > refer to https://issues.apache.org/jira/browse/KAFKA-4740's -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8248) Producer may fail IllegalStateException
[ https://issues.apache.org/jira/browse/KAFKA-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820743#comment-16820743 ] Ayusman Dikshit commented on KAFKA-8248: Does this issue apply ONLY to 2.0.0, or to later versions as well? > Producer may fail IllegalStateException > --- > > Key: KAFKA-8248 > URL: https://issues.apache.org/jira/browse/KAFKA-8248 > Project: Kafka > Issue Type: Bug > Components: producer >Affects Versions: 2.0.0 >Reporter: Matthias J. Sax >Priority: Major > > In a Kafka Streams application, we observed the following log from the > producer: > {quote}2019-04-17T01:58:25.898Z 17466081 [kafka-producer-network-thread | > client-id-enrichment-gcoint-StreamThread-7-0_10-producer] ERROR > org.apache.kafka.clients.producer.internals.Sender - [Producer > clientId=client-id-enrichment-gcoint-StreamThread-7-0_10-producer, > transactionalId=application-id-enrichment-gcoint-0_10] Uncaught error in > kafka producer I/O thread: > 2019-04-17T01:58:25.898Z java.lang.IllegalStateException: Attempt to send a > request to node 1 which is not ready. > 2019-04-17T01:58:25.898Z at > org.apache.kafka.clients.NetworkClient.doSend(NetworkClient.java:430) > 2019-04-17T01:58:25.898Z at > org.apache.kafka.clients.NetworkClient.send(NetworkClient.java:411) > 2019-04-17T01:58:25.898Z at > org.apache.kafka.clients.producer.internals.Sender.maybeSendTransactionalRequest(Sender.java:362) > 2019-04-17T01:58:25.898Z at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:214) > 2019-04-17T01:58:25.898Z at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:163) > 2019-04-17T01:58:25.898Z at java.lang.Thread.run(Thread.java:748) > {quote} > Later, Kafka Streams (running with EOS enabled) shuts down with a > `TimeoutException` that occurs during rebalance. It seems that the above error > results in this `TimeoutException`. However, an `IllegalStateException` seems > to indicate a bug in the producer. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-8252) topic catalog.offer-updated.1 has an unexpected message
zan fu created KAFKA-8252: - Summary: topic catalog.offer-updated.1 has an unexpected message Key: KAFKA-8252 URL: https://issues.apache.org/jira/browse/KAFKA-8252 Project: Kafka Issue Type: Bug Reporter: zan fu refer to https://issues.apache.org/jira/browse/KAFKA-4740's -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8246) refactor topic/group instance id validation condition
[ https://issues.apache.org/jira/browse/KAFKA-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820708#comment-16820708 ] Boyang Chen commented on KAFKA-8246: Interesting, thanks for the info! [~sliebau] Cc [~hachikuji] > refactor topic/group instance id validation condition > - > > Key: KAFKA-8246 > URL: https://issues.apache.org/jira/browse/KAFKA-8246 > Project: Kafka > Issue Type: Improvement >Reporter: Boyang Chen >Assignee: Boyang Chen >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8153) Streaming application with state stores takes up to 1 hour to restart
[ https://issues.apache.org/jira/browse/KAFKA-8153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820680#comment-16820680 ] Matthias J. Sax commented on KAFKA-8153: [~mmelsen] Any updates on this? I am wondering whether you agree with my assessment. > Streaming application with state stores takes up to 1 hour to restart > - > > Key: KAFKA-8153 > URL: https://issues.apache.org/jira/browse/KAFKA-8153 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.1.0 >Reporter: Michael Melsen >Priority: Major > > We are using spring cloud stream with Kafka streams 2.0.1 and utilizing the > InteractiveQueryService to fetch data from the stores. There are 4 stores > that persist data on disk after aggregating data. The code for the topology > looks like this:
> {code:java}
> @Slf4j
> @EnableBinding(SensorMeasurementBinding.class)
> public class Consumer {
>
>   public static final String RETENTION_MS = "retention.ms";
>   public static final String CLEANUP_POLICY = "cleanup.policy";
>
>   @Value("${windowstore.retention.ms}")
>   private String retention;
>
>   /**
>    * Process the data flowing in from a Kafka topic. Aggregate the data to:
>    * - 2 minutes
>    * - 15 minutes
>    * - one hour
>    * - 12 hours
>    *
>    * @param stream
>    */
>   @StreamListener(SensorMeasurementBinding.ERROR_SCORE_IN)
>   public void process(KStream<String, SensorMeasurement> stream) {
>     Map<String, String> topicConfig = new HashMap<>();
>     topicConfig.put(RETENTION_MS, retention);
>     topicConfig.put(CLEANUP_POLICY, "delete");
>     log.info("Changelog and local window store retention.ms: {} and cleanup.policy: {}",
>         topicConfig.get(RETENTION_MS), topicConfig.get(CLEANUP_POLICY));
>     createWindowStore(LocalStore.TWO_MINUTES_STORE, topicConfig, stream);
>     createWindowStore(LocalStore.FIFTEEN_MINUTES_STORE, topicConfig, stream);
>     createWindowStore(LocalStore.ONE_HOUR_STORE, topicConfig, stream);
>     createWindowStore(LocalStore.TWELVE_HOURS_STORE, topicConfig, stream);
>   }
>
>   private void createWindowStore(
>       LocalStore localStore,
>       Map<String, String> topicConfig,
>       KStream<String, SensorMeasurement> stream) {
>     // Configure how the state store should be materialized using the provided storeName
>     Materialized<String, ErrorScore, WindowStore<Bytes, byte[]>> materialized =
>         Materialized.as(localStore.getStoreName());
>     // Set retention of changelog topic
>     materialized.withLoggingEnabled(topicConfig);
>     // Configure how windows look and how long data will be retained in local stores
>     TimeWindows configuredTimeWindows = getConfiguredTimeWindows(
>         localStore.getTimeUnit(), Long.parseLong(topicConfig.get(RETENTION_MS)));
>     // Processing description:
>     // The input data are 'samples' with key :::
>     // 1. With the map we add the Tag to the key and extract the error score from the data
>     // 2. With the groupByKey we group the data on the new key
>     // 3. With windowedBy we split up the data in time intervals depending on the provided LocalStore enum
>     // 4. With reduce we determine the maximum value in the time window
>     // 5. Materialized will make it stored in a table
>     stream
>         .map(getInstallationAssetModelAlgorithmTagKeyMapper())
>         .groupByKey()
>         .windowedBy(configuredTimeWindows)
>         .reduce((aggValue, newValue) -> getMaxErrorScore(aggValue, newValue), materialized);
>   }
>
>   private TimeWindows getConfiguredTimeWindows(long windowSizeMs, long retentionMs) {
>     TimeWindows timeWindows = TimeWindows.of(windowSizeMs);
>     timeWindows.until(retentionMs);
>     return timeWindows;
>   }
>
>   /**
>    * Determine the max error score to keep by looking at the aggregated error signal and
>    * freshly consumed error signal
>    */
>   private ErrorScore getMaxErrorScore(ErrorScore aggValue, ErrorScore newValue) {
>     if (aggValue.getErrorSignal() > newValue.getErrorSignal()) {
>       return aggValue;
>     }
>     return newValue;
>   }
>
>   private KeyValueMapper<String, SensorMeasurement, KeyValue<String, ErrorScore>> getInstallationAssetModelAlgorithmTagKeyMapper() {
>     return (s, sensorMeasurement) -> new KeyValue<>(s + "::" + sensorMeasurement.getT(),
>         new ErrorScore(sensorMeasurement.getTs(), sensorMeasurement.getE(), sensorMeasurement.getO()));
>   }
> }
> {code}
> So we are materializing aggregated data to four different stores after > determining the max value within a specific window for a specific key. Please > note that retention is set to two months of data and the cleanup policy is > delete. We don't compact data. > The size of the individual state stores on disk is between 14 and 20 GB of > data. > We
[jira] [Commented] (KAFKA-8155) Update Streams system tests for 2.2.0 and 2.1.1 releases
[ https://issues.apache.org/jira/browse/KAFKA-8155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820671#comment-16820671 ] ASF GitHub Bot commented on KAFKA-8155: --- mjsax commented on pull request #6597: KAFKA-8155: Add 2.2.0 release to system tests URL: https://github.com/apache/kafka/pull/6597 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Update Streams system tests for 2.2.0 and 2.1.1 releases > > > Key: KAFKA-8155 > URL: https://issues.apache.org/jira/browse/KAFKA-8155 > Project: Kafka > Issue Type: Task > Components: streams, system tests >Reporter: John Roesler >Assignee: Matthias J. Sax >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8155) Update Streams system tests for 2.2.0 and 2.1.1 releases
[ https://issues.apache.org/jira/browse/KAFKA-8155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820666#comment-16820666 ] ASF GitHub Bot commented on KAFKA-8155: --- mjsax commented on pull request #6596: KAFKA-8155: Add 2.1.1 release to system tests URL: https://github.com/apache/kafka/pull/6596 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Update Streams system tests for 2.2.0 and 2.1.1 releases > > > Key: KAFKA-8155 > URL: https://issues.apache.org/jira/browse/KAFKA-8155 > Project: Kafka > Issue Type: Task > Components: streams, system tests >Reporter: John Roesler >Assignee: Matthias J. Sax >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-8251) Consider different logging for purgeData
Matthias J. Sax created KAFKA-8251: -- Summary: Consider different logging for purgeData Key: KAFKA-8251 URL: https://issues.apache.org/jira/browse/KAFKA-8251 Project: Kafka Issue Type: Improvement Components: core Reporter: Matthias J. Sax As reported in this SO question [https://stackoverflow.com/questions/55724259/kafka-streams-floods-kafka-logs/55738257#55738257] each time data is purged from a topic a log message is written: > {{[2019-04-17 09:06:16,541] INFO [Log >partition=my-application-KSTREAM-AGGREGATE-STATE-STORE-76-repartition-0, > dir=/opt/kafka/data/logs] Incrementing log start offset to 316423 >(kafka.log.Log) [2019-04-17 09:06:16,545] INFO [Log >partition=my-application-KSTREAM-AGGREGATE-STATE-STORE-33-repartition-2, > dir=/opt/kafka/data/logs] Incrementing log start offset to 3394 >(kafka.log.Log) }} While this is useful for regular log truncation, it seems less useful for explicit purge-data requests. Kafka Streams uses the purge-data API on repartition topics aggressively, and this can result in verbose logging. I don't know this part of the code, but maybe it's possible to distinguish both cases and only log time/size-based truncation at INFO level, reducing the log level for purge-data requests to TRACE? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
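The distinction proposed in KAFKA-8251 could be sketched roughly as follows. This is a hypothetical illustration only, not the actual kafka.log.Log code; the `Reason` enum and `levelFor` helper are invented names for the sake of the example:

```java
// Hypothetical sketch: tag each log-start-offset increment with the reason
// it happened, and choose the log level accordingly.
public class LogStartOffsetLogging {

    // Invented enum: why the log start offset moved.
    public enum Reason { SEGMENT_RETENTION, DELETE_RECORDS_REQUEST }

    // Retention-driven truncation stays at INFO; explicit purge-data
    // requests (issued aggressively by Kafka Streams on repartition
    // topics) drop to TRACE to avoid flooding the broker logs.
    public static String levelFor(Reason reason) {
        return reason == Reason.DELETE_RECORDS_REQUEST ? "TRACE" : "INFO";
    }

    public static void main(String[] args) {
        System.out.println(levelFor(Reason.SEGMENT_RETENTION));      // INFO
        System.out.println(levelFor(Reason.DELETE_RECORDS_REQUEST)); // TRACE
    }
}
```

The sketch only shows the branching; in the real broker the caller of the logging statement would need to know which code path triggered the offset increment.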
[jira] [Commented] (KAFKA-4212) Add a key-value store that is a TTL persistent cache
[ https://issues.apache.org/jira/browse/KAFKA-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820644#comment-16820644 ] Chris Toomey commented on KAFKA-4212: - Our use case: we're using RocksDB as a key/value cache of a topic, and we need the cache to delete messages when the underlying topic does (via its configured retention time). One way to achieve this would be to have the broker publish tombstone records for the deleted messages, or to provide a means for some other entity to do so. > Add a key-value store that is a TTL persistent cache > > > Key: KAFKA-4212 > URL: https://issues.apache.org/jira/browse/KAFKA-4212 > Project: Kafka > Issue Type: Improvement > Components: streams >Affects Versions: 0.10.0.1 >Reporter: Elias Levy >Priority: Major > Labels: api > > Some jobs need to maintain as state a large set of key-values for some > period of time, i.e., they need to maintain a TTL cache of values potentially > larger than memory. > Currently Kafka Streams provides non-windowed and windowed key-value stores. > Neither is an exact fit for this use case. > The {{RocksDBStore}}, a {{KeyValueStore}}, stores one value per key as > required, but does not support expiration. The TTL option of RocksDB is > explicitly not used. > The {{RocksDBWindowStore}}, a {{WindowStore}}, can expire items via segment > dropping, but it stores multiple items per key, based on their timestamp. > But this store can be repurposed as a cache by fetching the items in reverse > chronological order and returning the first item found. > KAFKA-2594 introduced a fixed-capacity in-memory LRU caching store, but here > we desire a variable-capacity memory-overflowing TTL caching store. > Although {{RocksDBWindowStore}} can be repurposed as a cache, it would be > useful to have an official and proper TTL cache API and implementation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
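The TTL key-value semantics KAFKA-4212 asks for (one value per key, values expiring after a fixed period) can be illustrated with a small self-contained sketch. This only demonstrates the desired behaviour, not a proposed store implementation; every name here is invented, and the explicit `nowMs` parameter is an assumption made to keep expiry deterministic:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal illustration of TTL key-value semantics: values older than
// ttlMs are treated as absent and dropped lazily on read, loosely
// mimicking the segment dropping of a windowed store.
public class TtlKeyValueSketch<K, V> {

    private static final class Entry<V> {
        final V value;
        final long insertedAtMs;
        Entry(V value, long insertedAtMs) { this.value = value; this.insertedAtMs = insertedAtMs; }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long ttlMs;

    public TtlKeyValueSketch(long ttlMs) { this.ttlMs = ttlMs; }

    public void put(K key, V value, long nowMs) {
        entries.put(key, new Entry<>(value, nowMs));
    }

    // Returns null for both missing and expired keys.
    public V get(K key, long nowMs) {
        Entry<V> e = entries.get(key);
        if (e == null) return null;
        if (nowMs - e.insertedAtMs > ttlMs) {
            entries.remove(key);
            return null;
        }
        return e.value;
    }

    public static void main(String[] args) {
        TtlKeyValueSketch<String, Integer> cache = new TtlKeyValueSketch<>(100L);
        cache.put("k", 1, 0L);
        System.out.println(cache.get("k", 50L));  // 1
        System.out.println(cache.get("k", 200L)); // null
    }
}
```

A real store would consult stream time or wall-clock time instead of taking `nowMs` explicitly, and would need eviction independent of reads to bound disk usage.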
[jira] [Created] (KAFKA-8250) Flaky Test DelegationTokenEndToEndAuthorizationTest#testProduceConsumeViaAssign
Matthias J. Sax created KAFKA-8250: -- Summary: Flaky Test DelegationTokenEndToEndAuthorizationTest#testProduceConsumeViaAssign Key: KAFKA-8250 URL: https://issues.apache.org/jira/browse/KAFKA-8250 Project: Kafka Issue Type: Bug Components: core Affects Versions: 2.3.0 Reporter: Matthias J. Sax Fix For: 2.3.0 [https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk11/detail/kafka-trunk-jdk11/442/tests] {quote}java.lang.AssertionError: Consumed more records than expected expected:<1> but was:<2> at org.junit.Assert.fail(Assert.java:89) at org.junit.Assert.failNotEquals(Assert.java:835) at org.junit.Assert.assertEquals(Assert.java:647) at kafka.utils.TestUtils$.consumeRecords(TestUtils.scala:1288) at kafka.api.EndToEndAuthorizationTest.consumeRecords(EndToEndAuthorizationTest.scala:460) at kafka.api.EndToEndAuthorizationTest.testProduceConsumeViaAssign(EndToEndAuthorizationTest.scala:209){quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7647) Flaky test LogCleanerParameterizedIntegrationTest.testCleansCombinedCompactAndDeleteTopic
[ https://issues.apache.org/jira/browse/KAFKA-7647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820612#comment-16820612 ] Matthias J. Sax commented on KAFKA-7647: https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3564/tests > Flaky test > LogCleanerParameterizedIntegrationTest.testCleansCombinedCompactAndDeleteTopic > - > > Key: KAFKA-7647 > URL: https://issues.apache.org/jira/browse/KAFKA-7647 > Project: Kafka > Issue Type: Sub-task > Components: core, unit tests >Affects Versions: 2.3.0 >Reporter: Dong Lin >Priority: Critical > Labels: flaky-test > Fix For: 2.3.0 > > > {code} > kafka.log.LogCleanerParameterizedIntegrationTest > > testCleansCombinedCompactAndDeleteTopic[3] FAILED > java.lang.AssertionError: Contents of the map shouldn't change > expected: (340,340), 5 -> (345,345), 10 -> (350,350), 14 -> > (354,354), 1 -> (341,341), 6 -> (346,346), 9 -> (349,349), 13 -> (353,353), > 2 -> (342,342), 17 -> (357,357), 12 -> (352,352), 7 -> (347,347), 3 -> > (343,343), 18 -> (358,358), 16 -> (356,356), 11 -> (351,351), 8 -> > (348,348), 19 -> (359,359), 4 -> (344,344), 15 -> (355,355))> but > was: (340,340), 5 -> (345,345), 10 -> (350,350), 14 -> (354,354), > 1 -> (341,341), 6 -> (346,346), 9 -> (349,349), 13 -> (353,353), 2 -> > (342,342), 17 -> (357,357), 12 -> (352,352), 7 -> (347,347), 3 -> > (343,343), 18 -> (358,358), 16 -> (356,356), 11 -> (351,351), 99 -> > (299,299), 8 -> (348,348), 19 -> (359,359), 4 -> (344,344), 15 -> > (355,355))> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:834) > at org.junit.Assert.assertEquals(Assert.java:118) > at > kafka.log.LogCleanerParameterizedIntegrationTest.testCleansCombinedCompactAndDeleteTopic(LogCleanerParameterizedIntegrationTest.scala:129) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8248) Producer may fail IllegalStateException
[ https://issues.apache.org/jira/browse/KAFKA-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820597#comment-16820597 ] Jason Gustafson commented on KAFKA-8248: Looking at the code, I think the problem might be that we don't update the time before calling `client.send()` inside `maybeSendTransactionalRequest`. Basically `NetworkClient.isReady` might return a different value at different times, so if the latest time is not used, it might return incorrectly. > Producer may fail IllegalStateException > --- > > Key: KAFKA-8248 > URL: https://issues.apache.org/jira/browse/KAFKA-8248 > Project: Kafka > Issue Type: Bug > Components: producer >Affects Versions: 2.0.0 >Reporter: Matthias J. Sax >Priority: Major > > In a Kafka Streams application, we observed the following log from the > producer: > {quote}2019-04-17T01:58:25.898Z 17466081 [kafka-producer-network-thread | > client-id-enrichment-gcoint-StreamThread-7-0_10-producer] ERROR > org.apache.kafka.clients.producer.internals.Sender - [Producer > clientId=client-id-enrichment-gcoint-StreamThread-7-0_10-producer, > transactionalId=application-id-enrichment-gcoint-0_10] Uncaught error in > kafka producer I/O thread: > 2019-04-17T01:58:25.898Z java.lang.IllegalStateException: Attempt to send a > request to node 1 which is not ready. 
> 2019-04-17T01:58:25.898Z at > org.apache.kafka.clients.NetworkClient.doSend(NetworkClient.java:430) > 2019-04-17T01:58:25.898Z at > org.apache.kafka.clients.NetworkClient.send(NetworkClient.java:411) > 2019-04-17T01:58:25.898Z at > org.apache.kafka.clients.producer.internals.Sender.maybeSendTransactionalRequest(Sender.java:362) > 2019-04-17T01:58:25.898Z at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:214) > 2019-04-17T01:58:25.898Z at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:163) > 2019-04-17T01:58:25.898Z at java.lang.Thread.run(Thread.java:748) > {quote} > Later, Kafka Streams (running with EOS enabled) shuts down with a > `TimeoutException` that occurs during rebalance. It seems that the above error > results in this `TimeoutException`. However, an `IllegalStateException` seems > to indicate a bug in the producer. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
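The stale-clock hazard Jason describes in his KAFKA-8248 comment can be shown in isolation. The `FakeNode` class below is an invented stand-in, not the real `NetworkClient` API; it only demonstrates how a readiness check made with an old timestamp can pass while the actual send at a later timestamp would fail:

```java
// Invented illustration of a time-dependent readiness check: a node is
// "ready" only while its connection has not yet expired, so the answer
// depends on which timestamp you ask with.
public class StaleTimeCheck {

    static final class FakeNode {
        final long idleDeadlineMs; // connection counts as ready only before this instant
        FakeNode(long idleDeadlineMs) { this.idleDeadlineMs = idleDeadlineMs; }
        boolean isReady(long nowMs) { return nowMs < idleDeadlineMs; }
    }

    // Buggy pattern: the readiness check uses a cached timestamp, but the
    // send happens later. If the check passed and the later answer is
    // "not ready", the send would hit the IllegalStateException path.
    public static boolean wouldThrow(FakeNode node, long checkedAtMs, long sentAtMs) {
        boolean checkedReady = node.isReady(checkedAtMs);
        boolean readyAtSendTime = node.isReady(sentAtMs);
        return checkedReady && !readyAtSendTime;
    }

    public static void main(String[] args) {
        FakeNode node = new FakeNode(10L);
        // Checked at t=5 (ready), sent at t=15 (expired): mismatch.
        System.out.println(wouldThrow(node, 5L, 15L)); // true
    }
}
```

The fix direction the comment suggests is simply to re-read the clock immediately before the check-and-send pair, so both see the same, current timestamp.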
[jira] [Comment Edited] (KAFKA-8185) Controller becomes stale and not able to failover the leadership for the partitions
[ https://issues.apache.org/jira/browse/KAFKA-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820573#comment-16820573 ] Dhruvil Shah edited comment on KAFKA-8185 at 4/17/19 11:45 PM: --- I investigated this issue with Kang and after looking through older logs, the most logical explanation is that the controller got into an invalid state after the topic znode was deleted directly from ZK. This is not an expected scenario as we expect users to use the `delete-topics` command, AdminClient or the DeleteTopicsRequest to delete a topic. was (Author: dhruvilshah): I investigated this issue with Kang and after looking through older logs, the most logical explanation is that the controller got into an invalid state after the topic znode was deleted from ZK. This is not an expected scenario as we expect users to use the `delete-topics` command, AdminClient or the DeleteTopicsRequest to delete a topic. > Controller becomes stale and not able to failover the leadership for the > partitions > --- > > Key: KAFKA-8185 > URL: https://issues.apache.org/jira/browse/KAFKA-8185 > Project: Kafka > Issue Type: Bug > Components: controller >Affects Versions: 1.1.1 >Reporter: Kang H Lee >Priority: Critical > Attachments: broker12.zip, broker9.zip, zookeeper.zip > > > Description: > After broker 9 went offline, all partitions led by it went offline. 
The > controller attempted to move leadership but ran into an exception while doing > so: > {code:java} > // [2019-03-26 01:23:34,114] ERROR [PartitionStateMachine controllerId=12] > Error while moving some partitions to OnlinePartition state > (kafka.controller.PartitionStateMachine) > java.util.NoSuchElementException: key not found: me-test-1 > at scala.collection.MapLike$class.default(MapLike.scala:228) > at scala.collection.AbstractMap.default(Map.scala:59) > at scala.collection.mutable.HashMap.apply(HashMap.scala:65) > at > kafka.controller.PartitionStateMachine$$anonfun$14.apply(PartitionStateMachine.scala:202) > at > kafka.controller.PartitionStateMachine$$anonfun$14.apply(PartitionStateMachine.scala:202) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) > at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) > at scala.collection.AbstractTraversable.map(Traversable.scala:104) > at > kafka.controller.PartitionStateMachine.initializeLeaderAndIsrForPartitions(PartitionStateMachine.scala:202) > at > kafka.controller.PartitionStateMachine.doHandleStateChanges(PartitionStateMachine.scala:167) > at > kafka.controller.PartitionStateMachine.handleStateChanges(PartitionStateMachine.scala:116) > at > kafka.controller.PartitionStateMachine.triggerOnlinePartitionStateChange(PartitionStateMachine.scala:106) > at > kafka.controller.KafkaController.kafka$controller$KafkaController$$onReplicasBecomeOffline(KafkaController.scala:437) > at > kafka.controller.KafkaController.kafka$controller$KafkaController$$onBrokerFailure(KafkaController.scala:405) > at > kafka.controller.KafkaController$BrokerChange$.process(KafkaController.scala:1246) > at > 
kafka.controller.ControllerEventManager$ControllerEventThread$$anonfun$doWork$1.apply$mcV$sp(ControllerEventManager.scala:69) > at > kafka.controller.ControllerEventManager$ControllerEventThread$$anonfun$doWork$1.apply(ControllerEventManager.scala:69) > at > kafka.controller.ControllerEventManager$ControllerEventThread$$anonfun$doWork$1.apply(ControllerEventManager.scala:69) > at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31) > at > kafka.controller.ControllerEventManager$ControllerEventThread.doWork(ControllerEventManager.scala:68) > at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82) > {code} > The controller was unable to move leadership of partitions led by broker 9 as > a result. It's worth noting that the controller ran into the same exception > when the broker came back up online. The controller thinks `me-test-1` is a > new partition and when attempting to transition it to an online partition, it > is unable to retrieve its replica assignment from > ControllerContext#partitionReplicaAssignment. I need to look through the code > to figure out if there's a race condition or situations where we remove the > partition from ControllerContext#partitionReplicaAssignment but might still > leave it in PartitionStateMachine#partitionState. > They had to change the controller to recover from the offline status. > Sequential event:
[jira] [Commented] (KAFKA-7652) Kafka Streams Session store performance degradation from 0.10.2.2 to 0.11.0.0
[ https://issues.apache.org/jira/browse/KAFKA-7652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820588#comment-16820588 ] ASF GitHub Bot commented on KAFKA-7652: --- guozhangwang commented on pull request #6448: KAFKA-7652: Restrict range of fetch/findSessions in cache URL: https://github.com/apache/kafka/pull/6448 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Kafka Streams Session store performance degradation from 0.10.2.2 to 0.11.0.0 > - > > Key: KAFKA-7652 > URL: https://issues.apache.org/jira/browse/KAFKA-7652 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.11.0.0, 0.11.0.1, 0.11.0.2, 0.11.0.3, 1.1.1, 2.0.0, > 2.0.1 >Reporter: Jonathan Gordon >Assignee: Guozhang Wang >Priority: Major > Labels: kip > Attachments: 0.10.2.1-NamedCache.txt, 2.2.0-rc0_b-NamedCache.txt, > 2.3.0-7652-NamedCache.txt, kafka_10_2_1_flushes.txt, kafka_11_0_3_flushes.txt > > > I'm creating this issue in response to [~guozhang]'s request on the mailing > list: > [https://lists.apache.org/thread.html/97d620f4fd76be070ca4e2c70e2fda53cafe051e8fc4505dbcca0321@%3Cusers.kafka.apache.org%3E] > We are attempting to upgrade our Kafka Streams application from 0.10.2.1 but > are experiencing a severe performance degradation. The highest amount of CPU time > seems to be spent retrieving from the local cache. Here's an example thread > profile with 0.11.0.0: > [https://i.imgur.com/l5VEsC2.png] > When things are running smoothly we're gated by retrieving from the state > store with acceptable performance. Here's an example thread profile with > 0.10.2.1: > [https://i.imgur.com/IHxC2cZ.png] > Some investigation reveals that we appear to be performing about 3 orders of > magnitude more lookups on the NamedCache over a comparable time period. 
I've > attached logs of the NamedCache flush logs for 0.10.2.1 and 0.11.0.3. > We're using session windows and have the app configured for > commit.interval.ms = 30 * 1000 and cache.max.bytes.buffering = 10485760 > I'm happy to share more details if they would be helpful. Also happy to run > tests on our data. > I also found this issue, which seems like it may be related: > https://issues.apache.org/jira/browse/KAFKA-4904 > > KIP-420: > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-420%3A+Add+Single+Value+Fetch+in+Session+Stores] > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (KAFKA-8185) Controller becomes stale and not able to failover the leadership for the partitions
[ https://issues.apache.org/jira/browse/KAFKA-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820573#comment-16820573 ] Dhruvil Shah edited comment on KAFKA-8185 at 4/17/19 11:42 PM: --- I investigated this issue with Kang and after looking through older logs, the most logical explanation is that the controller got into an invalid state after the topic znode was deleted from ZK. This is not an expected scenario as we expect users to use the `delete-topics` command, AdminClient or the DeleteTopicsRequest to delete a topic. was (Author: dhruvilshah): This is not a typically expected scenario and would only happen when the topic znode is deleted directly from ZK. > Controller becomes stale and not able to failover the leadership for the > partitions > --- > > Key: KAFKA-8185 > URL: https://issues.apache.org/jira/browse/KAFKA-8185 > Project: Kafka > Issue Type: Bug > Components: controller >Affects Versions: 1.1.1 >Reporter: Kang H Lee >Priority: Critical > Attachments: broker12.zip, broker9.zip, zookeeper.zip > > > Description: > After broker 9 went offline, all partitions led by it went offline. 
The > controller attempted to move leadership but ran into an exception while doing > so: > {code:java} > // [2019-03-26 01:23:34,114] ERROR [PartitionStateMachine controllerId=12] > Error while moving some partitions to OnlinePartition state > (kafka.controller.PartitionStateMachine) > java.util.NoSuchElementException: key not found: me-test-1 > at scala.collection.MapLike$class.default(MapLike.scala:228) > at scala.collection.AbstractMap.default(Map.scala:59) > at scala.collection.mutable.HashMap.apply(HashMap.scala:65) > at > kafka.controller.PartitionStateMachine$$anonfun$14.apply(PartitionStateMachine.scala:202) > at > kafka.controller.PartitionStateMachine$$anonfun$14.apply(PartitionStateMachine.scala:202) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) > at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) > at scala.collection.AbstractTraversable.map(Traversable.scala:104) > at > kafka.controller.PartitionStateMachine.initializeLeaderAndIsrForPartitions(PartitionStateMachine.scala:202) > at > kafka.controller.PartitionStateMachine.doHandleStateChanges(PartitionStateMachine.scala:167) > at > kafka.controller.PartitionStateMachine.handleStateChanges(PartitionStateMachine.scala:116) > at > kafka.controller.PartitionStateMachine.triggerOnlinePartitionStateChange(PartitionStateMachine.scala:106) > at > kafka.controller.KafkaController.kafka$controller$KafkaController$$onReplicasBecomeOffline(KafkaController.scala:437) > at > kafka.controller.KafkaController.kafka$controller$KafkaController$$onBrokerFailure(KafkaController.scala:405) > at > kafka.controller.KafkaController$BrokerChange$.process(KafkaController.scala:1246) > at > 
kafka.controller.ControllerEventManager$ControllerEventThread$$anonfun$doWork$1.apply$mcV$sp(ControllerEventManager.scala:69) > at > kafka.controller.ControllerEventManager$ControllerEventThread$$anonfun$doWork$1.apply(ControllerEventManager.scala:69) > at > kafka.controller.ControllerEventManager$ControllerEventThread$$anonfun$doWork$1.apply(ControllerEventManager.scala:69) > at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31) > at > kafka.controller.ControllerEventManager$ControllerEventThread.doWork(ControllerEventManager.scala:68) > at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82) > {code} > The controller was unable to move leadership of partitions led by broker 9 as > a result. It's worth noting that the controller ran into the same exception > when the broker came back up online. The controller thinks `me-test-1` is a > new partition and when attempting to transition it to an online partition, it > is unable to retrieve its replica assignment from > ControllerContext#partitionReplicaAssignment. I need to look through the code > to figure out if there's a race condition or situations where we remove the > partition from ControllerContext#partitionReplicaAssignment but might still > leave it in PartitionStateMachine#partitionState. > They had to change the controller to recover from the offline status. > Sequential event: > * Broker 9 got restarted in between 2019-03-26 01:22:54,236 - 2019-03-26 > 01:27:30,967; this was an unclean shutdown. > * From 2019-03-26 01:27:30,967, broker 9 was rebuilding indexes. Broker 9 > wasn't able to process data at this
[jira] [Created] (KAFKA-8249) partition reassignment may never finish if topic deletion completes first
xiongqi wu created KAFKA-8249: - Summary: partition reassignment may never finish if topic deletion completes first Key: KAFKA-8249 URL: https://issues.apache.org/jira/browse/KAFKA-8249 Project: Kafka Issue Type: Bug Reporter: xiongqi wu Assignee: xiongqi wu Kafka allows topic deletion to complete successfully while there are pending partition reassignments for the same topics (if the topic deletion request comes after the partition reassignment). This leads to several issues: 1) pending partition reassignments of a deleted topic never complete because the topic is deleted. 2) onPartitionReassignment -> updateAssignedReplicasForPartition will throw an IllegalStateException for a non-existing node. This in turn causes the controller not to resume topic deletion for online brokers and also to fail to register the broker notification handler (etc.) during onBrokerStartup. To fix this, we need to clean up pending partition reassignments during topic deletion. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
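The fix proposed above (cleaning up pending reassignments when a topic is deleted) can be sketched with plain Java collections. This is a simplified, hypothetical model: the real controller keys reassignments by TopicPartition objects inside its deletion state machine, not by string prefixes, and all names below are illustrative.

```java
import java.util.*;

// Hypothetical model of the proposed fix: before topic deletion completes,
// drop any pending partition reassignment that belongs to the deleted topic,
// so onPartitionReassignment is never invoked for a non-existing topic.
public class ReassignmentCleanupSketch {
    // Pending reassignments keyed by "topic-partition" (simplification:
    // real code uses TopicPartition, which avoids prefix-collision issues).
    static Map<String, List<Integer>> pendingReassignments = new HashMap<>();

    static void onTopicDeletion(String topic) {
        // Remove every pending reassignment whose key belongs to the topic.
        pendingReassignments.keySet().removeIf(tp -> tp.startsWith(topic + "-"));
    }

    public static void main(String[] args) {
        pendingReassignments.put("me-test-0", Arrays.asList(1, 2, 3));
        pendingReassignments.put("other-0", Arrays.asList(4, 5, 6));
        onTopicDeletion("me-test");
        System.out.println(pendingReassignments.keySet()); // only "other-0" remains
    }
}
```

With this ordering, the controller no longer holds a reassignment for a topic whose metadata has already been removed, which is the condition that made the reassignment unable to ever finish.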
[jira] [Resolved] (KAFKA-8185) Controller becomes stale and not able to failover the leadership for the partitions
[ https://issues.apache.org/jira/browse/KAFKA-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dhruvil Shah resolved KAFKA-8185. - Resolution: Not A Problem This is not a typically expected scenario and would only happen when the topic znode is deleted directly from ZK. > Controller becomes stale and not able to failover the leadership for the > partitions > --- > > Key: KAFKA-8185 > URL: https://issues.apache.org/jira/browse/KAFKA-8185 > Project: Kafka > Issue Type: Bug > Components: controller >Affects Versions: 1.1.1 >Reporter: Kang H Lee >Priority: Critical > Attachments: broker12.zip, broker9.zip, zookeeper.zip > > > Description: > After broker 9 went offline, all partitions led by it went offline. The > controller attempted to move leadership but ran into an exception while doing > so: > {code:java} > // [2019-03-26 01:23:34,114] ERROR [PartitionStateMachine controllerId=12] > Error while moving some partitions to OnlinePartition state > (kafka.controller.PartitionStateMachine) > java.util.NoSuchElementException: key not found: me-test-1 > at scala.collection.MapLike$class.default(MapLike.scala:228) > at scala.collection.AbstractMap.default(Map.scala:59) > at scala.collection.mutable.HashMap.apply(HashMap.scala:65) > at > kafka.controller.PartitionStateMachine$$anonfun$14.apply(PartitionStateMachine.scala:202) > at > kafka.controller.PartitionStateMachine$$anonfun$14.apply(PartitionStateMachine.scala:202) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) > at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) > at scala.collection.AbstractTraversable.map(Traversable.scala:104) > at > 
kafka.controller.PartitionStateMachine.initializeLeaderAndIsrForPartitions(PartitionStateMachine.scala:202) > at > kafka.controller.PartitionStateMachine.doHandleStateChanges(PartitionStateMachine.scala:167) > at > kafka.controller.PartitionStateMachine.handleStateChanges(PartitionStateMachine.scala:116) > at > kafka.controller.PartitionStateMachine.triggerOnlinePartitionStateChange(PartitionStateMachine.scala:106) > at > kafka.controller.KafkaController.kafka$controller$KafkaController$$onReplicasBecomeOffline(KafkaController.scala:437) > at > kafka.controller.KafkaController.kafka$controller$KafkaController$$onBrokerFailure(KafkaController.scala:405) > at > kafka.controller.KafkaController$BrokerChange$.process(KafkaController.scala:1246) > at > kafka.controller.ControllerEventManager$ControllerEventThread$$anonfun$doWork$1.apply$mcV$sp(ControllerEventManager.scala:69) > at > kafka.controller.ControllerEventManager$ControllerEventThread$$anonfun$doWork$1.apply(ControllerEventManager.scala:69) > at > kafka.controller.ControllerEventManager$ControllerEventThread$$anonfun$doWork$1.apply(ControllerEventManager.scala:69) > at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31) > at > kafka.controller.ControllerEventManager$ControllerEventThread.doWork(ControllerEventManager.scala:68) > at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82) > {code} > The controller was unable to move leadership of partitions led by broker 9 as > a result. It's worth noting that the controller ran into the same exception > when the broker came back up online. The controller thinks `me-test-1` is a > new partition and when attempting to transition it to an online partition, it > is unable to retrieve its replica assignment from > ControllerContext#partitionReplicaAssignment. 
I need to look through the code > to figure out if there's a race condition, or situations where we remove the > partition from ControllerContext#partitionReplicaAssignment but might still > leave it in PartitionStateMachine#partitionState. > They had to force a controller change to recover from the offline status. > Sequence of events: > * Broker 9 got restarted between 2019-03-26 01:22:54,236 and 2019-03-26 > 01:27:30,967: this was an unclean shutdown. > * From 2019-03-26 01:27:30,967, broker 9 was rebuilding indexes. Broker 9 > wasn't able to process data at this moment. > * At 2019-03-26 01:29:36,741, broker 9 was starting to load replicas. > * [2019-03-26 01:29:36,202] ERROR [KafkaApi-9] Number of alive brokers '0' > does not meet the required replication factor '3' for the offsets topic > (configured via 'offsets.topic.replication.factor'). This error can be > ignored if the cluster is starting up and not all brokers are up yet. > (kafka.server.KafkaApis) > * At 2019-03-26
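The `NoSuchElementException: key not found` in the trace above comes from Scala's `HashMap.apply`, which throws when the key is absent. The defensive pattern is a lookup with a default, so a partition with no known assignment (e.g. because its znode was deleted directly from ZK) is skipped instead of crashing the controller event thread. The sketch below is an illustration of that pattern in plain Java, not the actual PartitionStateMachine code; all names are hypothetical.

```java
import java.util.*;

// Illustration of the failure mode: an unconditional map lookup throws for a
// missing key, while a lookup with a default degrades gracefully.
public class AssignmentLookupSketch {
    static Map<String, List<Integer>> partitionReplicaAssignment = new HashMap<>();

    // Return the replica assignment, or an empty list when the partition is
    // unknown, letting the caller skip it rather than abort the whole batch.
    static List<Integer> replicasFor(String topicPartition) {
        return partitionReplicaAssignment.getOrDefault(topicPartition, Collections.emptyList());
    }

    public static void main(String[] args) {
        partitionReplicaAssignment.put("me-test-0", Arrays.asList(9, 10, 12));
        System.out.println(replicasFor("me-test-0")); // known partition
        System.out.println(replicasFor("me-test-1")); // empty, no NoSuchElementException
    }
}
```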
[jira] [Updated] (KAFKA-8248) Producer may fail IllegalStateException
[ https://issues.apache.org/jira/browse/KAFKA-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax updated KAFKA-8248: --- Description: In a Kafka Streams application, we observed the following log from the producer: {quote}2019-04-17T01:58:25.898Z 17466081 [kafka-producer-network-thread | client-id-enrichment-gcoint-StreamThread-7-0_10-producer] ERROR org.apache.kafka.clients.producer.internals.Sender - [Producer clientId=client-id-enrichment-gcoint-StreamThread-7-0_10-producer, transactionalId=application-id-enrichment-gcoint-0_10] Uncaught error in kafka producer I/O thread: 2019-04-17T01:58:25.898Z java.lang.IllegalStateException: Attempt to send a request to node 1 which is not ready. 2019-04-17T01:58:25.898Z at org.apache.kafka.clients.NetworkClient.doSend(NetworkClient.java:430) 2019-04-17T01:58:25.898Z at org.apache.kafka.clients.NetworkClient.send(NetworkClient.java:411) 2019-04-17T01:58:25.898Z at org.apache.kafka.clients.producer.internals.Sender.maybeSendTransactionalRequest(Sender.java:362) 2019-04-17T01:58:25.898Z at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:214) 2019-04-17T01:58:25.898Z at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:163) 2019-04-17T01:58:25.898Z at java.lang.Thread.run(Thread.java:748) {quote} Later, Kafka Streams (running with EOS enabled) shuts down with a `TimeoutException` that occurs during rebalance. It seems that the above error results in this `TimeoutException`. However, an `IllegalStateException` seems to indicate a bug in the producer. 
was: In a Kafka Streams application, we observed the following log from the producer: {quote}2019-04-17T01:58:25.898Z 17466081 [kafka-producer-network-thread | client-id-enrichment-gcoint-StreamThread-7-0_10-producer] ERROR org.apache.kafka.clients.producer.internals.Sender - [Producer clientId=client-id-enrichment-gcoint-StreamThread-7-0_10-producer, transactionalId=application-id-enrichment-gcoint-0_10] Uncaught error in kafka producer I/O thread: 2019-04-17T01:58:25.898Z java.lang.IllegalStateException: Attempt to send a request to node 1 which is not ready. 2019-04-17T01:58:25.898Z at org.apache.kafka.clients.NetworkClient.doSend(NetworkClient.java:430) 2019-04-17T01:58:25.898Z at org.apache.kafka.clients.NetworkClient.send(NetworkClient.java:411) 2019-04-17T01:58:25.898Z at org.apache.kafka.clients.producer.internals.Sender.maybeSendTransactionalRequest(Sender.java:362) 2019-04-17T01:58:25.898Z at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:214) 2019-04-17T01:58:25.898Z at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:163) 2019-04-17T01:58:25.898Z at java.lang.Thread.run(Thread.java:748) {quote} Later, Kafka Streams shuts down with a `TimeoutException` that occurs during rebalance. It seem that the above error results in this `TimeoutException`. However, and `IllegalStateException` seem to indicate a bug in the producer. > Producer may fail IllegalStateException > --- > > Key: KAFKA-8248 > URL: https://issues.apache.org/jira/browse/KAFKA-8248 > Project: Kafka > Issue Type: Bug > Components: producer >Affects Versions: 2.0.0 >Reporter: Matthias J. 
Sax >Priority: Major > > In a Kafka Streams application, we observed the following log from the > producer: > {quote}2019-04-17T01:58:25.898Z 17466081 [kafka-producer-network-thread | > client-id-enrichment-gcoint-StreamThread-7-0_10-producer] ERROR > org.apache.kafka.clients.producer.internals.Sender - [Producer > clientId=client-id-enrichment-gcoint-StreamThread-7-0_10-producer, > transactionalId=application-id-enrichment-gcoint-0_10] Uncaught error in > kafka producer I/O thread: > 2019-04-17T01:58:25.898Z java.lang.IllegalStateException: Attempt to send a > request to node 1 which is not ready. > 2019-04-17T01:58:25.898Z at > org.apache.kafka.clients.NetworkClient.doSend(NetworkClient.java:430) > 2019-04-17T01:58:25.898Z at > org.apache.kafka.clients.NetworkClient.send(NetworkClient.java:411) > 2019-04-17T01:58:25.898Z at > org.apache.kafka.clients.producer.internals.Sender.maybeSendTransactionalRequest(Sender.java:362) > 2019-04-17T01:58:25.898Z at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:214) > 2019-04-17T01:58:25.898Z at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:163) > 2019-04-17T01:58:25.898Z at java.lang.Thread.run(Thread.java:748) > {quote} > Later, Kafka Streams (running with EOS enabled) shuts down with a > `TimeoutException` that occurs during rebalance. It seems that the above error > results in this `TimeoutException`. However, an `IllegalStateException` seems > to indicate a bug in the producer. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-8248) Producer may fail IllegalStateException
Matthias J. Sax created KAFKA-8248: -- Summary: Producer may fail IllegalStateException Key: KAFKA-8248 URL: https://issues.apache.org/jira/browse/KAFKA-8248 Project: Kafka Issue Type: Bug Components: producer Affects Versions: 2.0.0 Reporter: Matthias J. Sax In a Kafka Streams application, we observed the following log from the producer: {quote}2019-04-17T01:58:25.898Z 17466081 [kafka-producer-network-thread | client-id-enrichment-gcoint-StreamThread-7-0_10-producer] ERROR org.apache.kafka.clients.producer.internals.Sender - [Producer clientId=client-id-enrichment-gcoint-StreamThread-7-0_10-producer, transactionalId=application-id-enrichment-gcoint-0_10] Uncaught error in kafka producer I/O thread: 2019-04-17T01:58:25.898Z java.lang.IllegalStateException: Attempt to send a request to node 1 which is not ready. 2019-04-17T01:58:25.898Z at org.apache.kafka.clients.NetworkClient.doSend(NetworkClient.java:430) 2019-04-17T01:58:25.898Z at org.apache.kafka.clients.NetworkClient.send(NetworkClient.java:411) 2019-04-17T01:58:25.898Z at org.apache.kafka.clients.producer.internals.Sender.maybeSendTransactionalRequest(Sender.java:362) 2019-04-17T01:58:25.898Z at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:214) 2019-04-17T01:58:25.898Z at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:163) 2019-04-17T01:58:25.898Z at java.lang.Thread.run(Thread.java:748) {quote} Later, Kafka Streams shuts down with a `TimeoutException` that occurs during rebalance. It seems that the above error results in this `TimeoutException`. However, an `IllegalStateException` seems to indicate a bug in the producer. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
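The stack trace points at a contract violation: `NetworkClient.send` throws `IllegalStateException` unless the node was confirmed ready first, so the transactional send path should re-check readiness and back off rather than send unconditionally. The following is a self-contained toy model of that contract, not Kafka's actual Sender logic; the class and request names are illustrative.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model: send() enforces the "node must be ready" contract by throwing,
// while trySend() is the guarded variant that re-checks readiness and lets
// the caller poll and retry instead of hitting IllegalStateException.
public class ReadyCheckSketch {
    static Set<Integer> readyNodes = new HashSet<>();

    static void send(int node, String request) {
        if (!readyNodes.contains(node)) {
            throw new IllegalStateException(
                "Attempt to send a request to node " + node + " which is not ready.");
        }
        // ... write the request to the connection ...
    }

    static boolean trySend(int node, String request) {
        if (!readyNodes.contains(node)) {
            return false; // caller should poll the connection and retry later
        }
        send(node, request);
        return true;
    }

    public static void main(String[] args) {
        readyNodes.add(2);
        System.out.println(trySend(2, "AddPartitionsToTxn")); // succeeds
        System.out.println(trySend(1, "AddPartitionsToTxn")); // declines, no exception
    }
}
```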
[jira] [Commented] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820523#comment-16820523 ] Patrik Kleindl commented on KAFKA-5998: --- [~guozhang] Could you please share the direction(s) you are currently investigating? Is the problem only with the file or is the whole directory affected? And if the latter, how come this does not impact stream processing? > /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1 >Reporter: Yogesh BG >Priority: Critical > Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, > props.txt, streams.txt > > > I have one Kafka broker and one Kafka Streams app running... It has been > running for two days under a load of around 2500 msgs per second. On the third > day I am getting the below exception for some of the partitions. I have 16 > partitions; only 0_0 and 0_1 give this error > {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267) > 
[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > 09:43:25.974 [ks_0_inst-StreamThread-15] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > 
/data/kstreams/rtp-kafkastreams/0_0/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at >
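The failing call above is the write-to-tmp-then-rename pattern for checkpoint files; the `FileNotFoundException` on `.checkpoint.tmp` suggests the task directory itself was missing when the `.tmp` file was opened. The sketch below illustrates that pattern with a guard that (re)creates the parent directory first. It is an illustration using `java.nio.file`, not the actual `OffsetCheckpoint` implementation.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.List;

// Sketch of atomic checkpoint writing: write the content to a ".tmp"
// sibling, then atomically rename it over the real checkpoint file so
// readers never observe a partially written checkpoint.
public class CheckpointWriteSketch {
    static void writeCheckpoint(Path checkpoint, List<String> lines) throws IOException {
        Files.createDirectories(checkpoint.getParent()); // guard: dir may be gone
        Path tmp = checkpoint.resolveSibling(checkpoint.getFileName() + ".tmp");
        Files.write(tmp, lines);
        Files.move(tmp, checkpoint, StandardCopyOption.REPLACE_EXISTING,
                   StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("kstreams-0_1");
        Path checkpoint = dir.resolve(".checkpoint");
        writeCheckpoint(checkpoint, Arrays.asList("topic-0 42"));
        System.out.println(Files.readAllLines(checkpoint));
    }
}
```

Note that the directory guard only papers over the symptom; the open question in the comments above is why the task directory disappears in the first place.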
[jira] [Commented] (KAFKA-6520) When a Kafka Stream can't communicate with the server, it's Status stays RUNNING
[ https://issues.apache.org/jira/browse/KAFKA-6520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820488#comment-16820488 ] ASF GitHub Bot commented on KAFKA-6520: --- ConcurrencyPractitioner commented on pull request #6594: [KAFKA-6520] Add DISCONNECTED state to Kafka Streams URL: https://github.com/apache/kafka/pull/6594 We wish to add a DISCONNECTED state to Kafka Streams. Please see KIP-457 for details. https://cwiki.apache.org/confluence/display/KAFKA/KIP-457%3A+Add+DISCONNECTED+status+to+Kafka+Streams ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > When a Kafka Stream can't communicate with the server, it's Status stays > RUNNING > > > Key: KAFKA-6520 > URL: https://issues.apache.org/jira/browse/KAFKA-6520 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Michael Kohout >Assignee: Milind Jain >Priority: Major > Labels: newbie, user-experience > > When you execute the following scenario the application is always in RUNNING > state > > 1)start kafka > 2)start app, app connects to kafka and starts processing > 3)kill kafka(stop docker container) > 4)the application doesn't give any indication that it's no longer > connected(Stream State is still RUNNING, and the uncaught exception handler > isn't invoked) > > > It would be useful if the Stream State had a DISCONNECTED status. > > See > [this|https://groups.google.com/forum/#!topic/confluent-platform/nQh2ohgdrIQ] > for a discussion from the google user forum. > [This|https://issues.apache.org/jira/browse/KAFKA-4564] is a link to a > related issue. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8147) Add changelog topic configuration to KTable suppress
[ https://issues.apache.org/jira/browse/KAFKA-8147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820474#comment-16820474 ] ASF GitHub Bot commented on KAFKA-8147: --- mjduijn commented on pull request #6593: [WIP] KAFKA-8147 Add changelog topic configuration to KTable suppress URL: https://github.com/apache/kafka/pull/6593 KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-446%3A+Add+changelog+topic+configuration+to+KTable+suppress Jira: https://issues.apache.org/jira/browse/KAFKA-8147 Have not had time to do a full implementation, still lacking: * Tests * Proper handling of type topicConfig type erasure by `BufferConfigInternal` ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add changelog topic configuration to KTable suppress > > > Key: KAFKA-8147 > URL: https://issues.apache.org/jira/browse/KAFKA-8147 > Project: Kafka > Issue Type: Improvement > Components: streams >Affects Versions: 2.1.1 >Reporter: Maarten >Assignee: Maarten >Priority: Minor > Labels: needs-kip > > The streams DSL does not provide a way to configure the changelog topic > created by KTable.suppress. > From the perspective of an external user this could be implemented similar to > the configuration of aggregate + materialized, i.e., > {code:java} > changelogTopicConfigs = // Configs > materialized = Materialized.as(..).withLoggingEnabled(changelogTopicConfigs) > .. > KGroupedStream.aggregate(..,materialized) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (KAFKA-8241) Dynamic update of keystore fails on listener without truststore
[ https://issues.apache.org/jira/browse/KAFKA-8241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajini Sivaram resolved KAFKA-8241. --- Resolution: Fixed Reviewer: Manikumar Fix Version/s: 2.1.2 > Dynamic update of keystore fails on listener without truststore > --- > > Key: KAFKA-8241 > URL: https://issues.apache.org/jira/browse/KAFKA-8241 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.1, 2.0.1, 2.2.0, 2.1.1 >Reporter: Rajini Sivaram >Assignee: Rajini Sivaram >Priority: Major > Fix For: 2.3.0, 2.1.2, 2.2.1 > > > Validation of dynamically updated keystores and truststores assumes that both > are present. On brokers with only keystores and no truststore configured, > dynamic update fails with NPE. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7965) Flaky Test ConsumerBounceTest#testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup
[ https://issues.apache.org/jira/browse/KAFKA-7965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-7965: --- Fix Version/s: (was: 2.2.1) > Flaky Test > ConsumerBounceTest#testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup > > > Key: KAFKA-7965 > URL: https://issues.apache.org/jira/browse/KAFKA-7965 > Project: Kafka > Issue Type: Bug > Components: clients, consumer, unit tests >Affects Versions: 1.1.1, 2.2.0, 2.3.0 >Reporter: Matthias J. Sax >Assignee: huxihx >Priority: Critical > Labels: flaky-test > Fix For: 2.3.0 > > > To get stable nightly builds for `2.2` release, I create tickets for all > observed test failures. > [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/21/] > {quote}java.lang.AssertionError: Received 0, expected at least 68 at > org.junit.Assert.fail(Assert.java:88) at > org.junit.Assert.assertTrue(Assert.java:41) at > kafka.api.ConsumerBounceTest.receiveAndCommit(ConsumerBounceTest.scala:557) > at > kafka.api.ConsumerBounceTest.$anonfun$testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup$1(ConsumerBounceTest.scala:320) > at > kafka.api.ConsumerBounceTest.$anonfun$testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup$1$adapted(ConsumerBounceTest.scala:319) > at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) > at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at > kafka.api.ConsumerBounceTest.testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup(ConsumerBounceTest.scala:319){quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (KAFKA-7965) Flaky Test ConsumerBounceTest#testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup
[ https://issues.apache.org/jira/browse/KAFKA-7965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-7965. Resolution: Fixed > Flaky Test > ConsumerBounceTest#testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup > > > Key: KAFKA-7965 > URL: https://issues.apache.org/jira/browse/KAFKA-7965 > Project: Kafka > Issue Type: Bug > Components: clients, consumer, unit tests >Affects Versions: 1.1.1, 2.2.0, 2.3.0 >Reporter: Matthias J. Sax >Assignee: huxihx >Priority: Critical > Labels: flaky-test > Fix For: 2.3.0, 2.2.1 > > > To get stable nightly builds for `2.2` release, I create tickets for all > observed test failures. > [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/21/] > {quote}java.lang.AssertionError: Received 0, expected at least 68 at > org.junit.Assert.fail(Assert.java:88) at > org.junit.Assert.assertTrue(Assert.java:41) at > kafka.api.ConsumerBounceTest.receiveAndCommit(ConsumerBounceTest.scala:557) > at > kafka.api.ConsumerBounceTest.$anonfun$testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup$1(ConsumerBounceTest.scala:320) > at > kafka.api.ConsumerBounceTest.$anonfun$testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup$1$adapted(ConsumerBounceTest.scala:319) > at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) > at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at > kafka.api.ConsumerBounceTest.testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup(ConsumerBounceTest.scala:319){quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7965) Flaky Test ConsumerBounceTest#testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup
[ https://issues.apache.org/jira/browse/KAFKA-7965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820445#comment-16820445 ] ASF GitHub Bot commented on KAFKA-7965: --- hachikuji commented on pull request #6557: KAFKA-7965: Fix testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup URL: https://github.com/apache/kafka/pull/6557 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Flaky Test > ConsumerBounceTest#testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup > > > Key: KAFKA-7965 > URL: https://issues.apache.org/jira/browse/KAFKA-7965 > Project: Kafka > Issue Type: Bug > Components: clients, consumer, unit tests >Affects Versions: 1.1.1, 2.2.0, 2.3.0 >Reporter: Matthias J. Sax >Assignee: huxihx >Priority: Critical > Labels: flaky-test > Fix For: 2.3.0, 2.2.1 > > > To get stable nightly builds for `2.2` release, I create tickets for all > observed test failures. 
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/21/] > {quote}java.lang.AssertionError: Received 0, expected at least 68 at > org.junit.Assert.fail(Assert.java:88) at > org.junit.Assert.assertTrue(Assert.java:41) at > kafka.api.ConsumerBounceTest.receiveAndCommit(ConsumerBounceTest.scala:557) > at > kafka.api.ConsumerBounceTest.$anonfun$testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup$1(ConsumerBounceTest.scala:320) > at > kafka.api.ConsumerBounceTest.$anonfun$testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup$1$adapted(ConsumerBounceTest.scala:319) > at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) > at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at > kafka.api.ConsumerBounceTest.testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup(ConsumerBounceTest.scala:319){quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (KAFKA-8155) Update Streams system tests for 2.2.0 and 2.1.1 releases
[ https://issues.apache.org/jira/browse/KAFKA-8155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax reassigned KAFKA-8155: -- Assignee: Matthias J. Sax (was: John Roesler) > Update Streams system tests for 2.2.0 and 2.1.1 releases > > > Key: KAFKA-8155 > URL: https://issues.apache.org/jira/browse/KAFKA-8155 > Project: Kafka > Issue Type: Task > Components: streams, system tests >Reporter: John Roesler >Assignee: Matthias J. Sax >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8025) Flaky Test RocksDBGenericOptionsToDbOptionsColumnFamilyOptionsAdapterTest#shouldForwardAllDbOptionsCalls
[ https://issues.apache.org/jira/browse/KAFKA-8025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820402#comment-16820402 ] ASF GitHub Bot commented on KAFKA-8025: --- bbejeck commented on pull request #6370: KAFKA-8025: Update regex to allow any chars after ":" URL: https://github.com/apache/kafka/pull/6370 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Flaky Test > RocksDBGenericOptionsToDbOptionsColumnFamilyOptionsAdapterTest#shouldForwardAllDbOptionsCalls > > > Key: KAFKA-8025 > URL: https://issues.apache.org/jira/browse/KAFKA-8025 > Project: Kafka > Issue Type: Bug > Components: streams, unit tests >Affects Versions: 2.3.0 >Reporter: Konstantine Karantasis >Assignee: Bill Bejeck >Priority: Critical > Labels: flaky-test > Fix For: 2.3.0 > > > At least one occurrence where the following unit test case failed on a Jenkins > job that didn't involve any related changes. > [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/2783/consoleFull] > I have not been able to reproduce it locally on Linux. 
(For instance 20 > consecutive runs of this class pass all test cases) > {code:java} > 14:06:13 > org.apache.kafka.streams.state.internals.RocksDBGenericOptionsToDbOptionsColumnFamilyOptionsAdapterTest > > shouldForwardAllDbOptionsCalls STARTED 14:06:14 > org.apache.kafka.streams.state.internals.RocksDBGenericOptionsToDbOptionsColumnFamilyOptionsAdapterTest.shouldForwardAllDbOptionsCalls > failed, log available in > /home/jenkins/jenkins-slave/workspace/kafka-pr-jdk11-scala2.12/streams/build/reports/testOutput/org.apache.kafka.streams.state.internals.RocksDBGenericOptionsToDbOptionsColumnFamilyOptionsAdapterTest.shouldForwardAllDbOptionsCalls.test.stdout > 14:06:14 14:06:14 > org.apache.kafka.streams.state.internals.RocksDBGenericOptionsToDbOptionsColumnFamilyOptionsAdapterTest > > shouldForwardAllDbOptionsCalls FAILED 14:06:14 > java.lang.AssertionError: 14:06:14 Expected: a string matching the > pattern 'Unexpected method call DBOptions\.baseBackgroundCompactions((.* > 14:06:14 *)*):' 14:06:14 but: was "Unexpected method call > DBOptions.baseBackgroundCompactions():\n DBOptions.close(): expected: 3, > actual: 0" 14:06:14 at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18) 14:06:14 > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) 14:06:14 > at > org.apache.kafka.streams.state.internals.RocksDBGenericOptionsToDbOptionsColumnFamilyOptionsAdapterTest.verifyDBOptionsMethodCall(RocksDBGenericOptionsToDbOptionsColumnFamilyOptionsAdapterTest.java:121) > 14:06:14 at > org.apache.kafka.streams.state.internals.RocksDBGenericOptionsToDbOptionsColumnFamilyOptionsAdapterTest.shouldForwardAllDbOptionsCalls(RocksDBGenericOptionsToDbOptionsColumnFamilyOptionsAdapterTest.java:101) > 14:06:14 14:06:14 > org.apache.kafka.streams.state.internals.RocksDBGenericOptionsToDbOptionsColumnFamilyOptionsAdapterTest > > shouldForwardAllColumnFamilyCalls STARTED 14:06:14 14:06:14 > 
org.apache.kafka.streams.state.internals.RocksDBGenericOptionsToDbOptionsColumnFamilyOptionsAdapterTest > > shouldForwardAllColumnFamilyCalls PASSED > {code} > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
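The assertion failure quoted above happens because the expected pattern stops at the trailing ':' while EasyMock's actual message continues with further unexpected/missing-call lines. A minimal sketch of the mismatch and of the fix direction in PR #6370 (allow any characters after the ':'), using simplified stand-in patterns rather than the exact ones from the test:

```python
import re

# Stand-in for the assertion message produced by EasyMock: the text
# continues after the ':' with further unexpected/missing call details.
message = (
    "Unexpected method call DBOptions.baseBackgroundCompactions():\n"
    "    DBOptions.close(): expected: 3, actual: 0"
)

# Old-style pattern: nothing is allowed after the trailing ':' ...
old_pattern = r"Unexpected method call DBOptions\.baseBackgroundCompactions\(\):"
# Fixed pattern: any characters (including newlines) may follow the ':'.
new_pattern = old_pattern + r"(.|\n)*"

print(re.fullmatch(old_pattern, message) is not None)  # old pattern rejects the message
print(re.fullmatch(new_pattern, message) is not None)  # fixed pattern accepts it
```

The point of the fix is that EasyMock appends extra diagnostics after the first unexpected call, so the expected pattern must tolerate arbitrary trailing text.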
[jira] [Updated] (KAFKA-8231) Expansion of ConnectClusterState interface
[ https://issues.apache.org/jira/browse/KAFKA-8231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Egerton updated KAFKA-8231: - Fix Version/s: (was: 2.3.0) > Expansion of ConnectClusterState interface > -- > > Key: KAFKA-8231 > URL: https://issues.apache.org/jira/browse/KAFKA-8231 > Project: Kafka > Issue Type: Improvement > Components: KafkaConnect >Reporter: Chris Egerton >Assignee: Chris Egerton >Priority: Major > > This covers [KIP-454: Expansion of the ConnectClusterState > interface|https://cwiki.apache.org/confluence/display/KAFKA/KIP-454%3A+Expansion+of+the+ConnectClusterState+interface] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8215) Limit memory usage of RocksDB
[ https://issues.apache.org/jira/browse/KAFKA-8215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820353#comment-16820353 ] ASF GitHub Bot commented on KAFKA-8215: --- ableegoldman commented on pull request #6560: KAFKA-8215: Pt I. Share block cache between instances URL: https://github.com/apache/kafka/pull/6560 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Limit memory usage of RocksDB > - > > Key: KAFKA-8215 > URL: https://issues.apache.org/jira/browse/KAFKA-8215 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Sophie Blee-Goldman >Priority: Major > > The memory usage of Streams is currently unbounded in part because of > RocksDB, which consumes memory on a per-instance basis. Each instance (ie > each persistent state store) will have its own write buffer, index blocks, > and block cache. The size of these can be configured individually, but there > is currently no way for a Streams app to limit the total memory available > across instances. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
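The direction of the linked PR (sharing one block cache across store instances) can be illustrated with a toy model. This is only a sketch: a tiny LRU structure stands in for RocksDB's block cache, and none of the names below come from the Streams API.

```python
from collections import OrderedDict

class BoundedCache:
    """Toy stand-in for a RocksDB block cache with a fixed byte budget."""
    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self._entries = OrderedDict()

    def put(self, key, size):
        if key in self._entries:
            self.used -= self._entries.pop(key)
        self._entries[key] = size
        self.used += size
        while self.used > self.capacity:  # evict least-recently used
            _, evicted = self._entries.popitem(last=False)
            self.used -= evicted

# Per-instance caches: total memory grows with the number of state stores.
per_store = [BoundedCache(50) for _ in range(4)]
for c in per_store:
    c.put("block", 50)
print(sum(c.used for c in per_store))  # 200 bytes: grows with the store count

# Shared cache: every store charges one budget, so usage stays bounded.
shared = BoundedCache(50)
for store_id in range(4):
    shared.put(f"store-{store_id}/block", 50)
print(shared.used)  # never exceeds the 50-byte budget
```

In Streams itself this corresponds to pointing every store's RocksDB options at a single cache object through a custom `rocksdb.config.setter`, instead of letting each store allocate its own.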
[jira] [Commented] (KAFKA-8241) Dynamic update of keystore fails on listener without truststore
[ https://issues.apache.org/jira/browse/KAFKA-8241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820322#comment-16820322 ] ASF GitHub Bot commented on KAFKA-8241: --- rajinisivaram commented on pull request #6585: KAFKA-8241; Handle configs without truststore for broker keystore update URL: https://github.com/apache/kafka/pull/6585 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Dynamic update of keystore fails on listener without truststore > --- > > Key: KAFKA-8241 > URL: https://issues.apache.org/jira/browse/KAFKA-8241 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.1, 2.0.1, 2.2.0, 2.1.1 >Reporter: Rajini Sivaram >Assignee: Rajini Sivaram >Priority: Major > Fix For: 2.3.0, 2.2.1 > > > Validation of dynamically updated keystores and truststores assumes that both > are present. On brokers with only keystores and no truststore configured, > dynamic update fails with NPE. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
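The fix amounts to a null guard during validation: a listener may legitimately have a keystore configured and no truststore. A hedged sketch of that guard with hypothetical names (the real logic lives in the broker's dynamic SSL config validation):

```python
def validate_reconfiguration(new_keystore, new_truststore, old_truststore):
    """Toy validation: only compare truststores when one is configured.

    Hypothetical sketch of the guard; without it, validation dereferences
    a missing truststore and fails with an NPE, as in the report.
    """
    if new_keystore is None:
        raise ValueError("keystore required")
    # Guard: skip the truststore comparison entirely when none is configured.
    if new_truststore is not None and old_truststore is not None:
        if new_truststore["path"] != old_truststore["path"]:
            raise ValueError("truststore path cannot change dynamically")
    return True

# Listener with a keystore only: must validate without raising.
print(validate_reconfiguration({"path": "ks.jks"}, None, None))
```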
[jira] [Updated] (KAFKA-3539) KafkaProducer.send() may block even though it returns the Future
[ https://issues.apache.org/jira/browse/KAFKA-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Richard Yu updated KAFKA-3539: -- Labels: needs-discussion needs-kip (was: ) > KafkaProducer.send() may block even though it returns the Future > > > Key: KAFKA-3539 > URL: https://issues.apache.org/jira/browse/KAFKA-3539 > Project: Kafka > Issue Type: Bug > Components: producer >Reporter: Oleg Zhurakousky >Priority: Critical > Labels: needs-discussion, needs-kip > > You can get more details from the us...@kafka.apache.org by searching on the > thread with the subject "KafkaProducer block on send". > The bottom line is that method that returns Future must never block, since it > essentially violates the Future contract as it was specifically designed to > return immediately passing control back to the user to check for completion, > cancel etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
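The contract violation is easiest to see next to a workaround many applications use: hand the potentially blocking call to a separate thread so the caller receives a Future immediately. A self-contained sketch with a stand-in for a send() stuck on a metadata fetch (the names are illustrative, not producer API):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

started = threading.Event()
release = threading.Event()

def blocking_send(record):
    """Stand-in for a KafkaProducer.send() stuck on a metadata fetch."""
    started.set()
    release.wait()  # simulate the internal blocking wait
    return f"ack:{record}"

executor = ThreadPoolExecutor(max_workers=1)

# Wrapping the call hands back a Future immediately, even though the
# underlying send has not finished (it has not even been released yet).
future = executor.submit(blocking_send, "r1")
started.wait(timeout=5)
immediately_done = future.done()  # False: the caller was never blocked

release.set()  # let the fake send complete
result = future.result(timeout=5)
print(immediately_done, result)
executor.shutdown()
```

Note this only moves the blocking elsewhere; the discussion in the ticket is about whether the producer itself should honor the Future contract.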
[jira] [Created] (KAFKA-8247) Duplicate error handling in kafka-server-start.sh and actual Kafka class
Sönke Liebau created KAFKA-8247: --- Summary: Duplicate error handling in kafka-server-start.sh and actual Kafka class Key: KAFKA-8247 URL: https://issues.apache.org/jira/browse/KAFKA-8247 Project: Kafka Issue Type: Bug Components: core Affects Versions: 2.1.1 Reporter: Sönke Liebau There is some duplication of error handling for command line parameters that are passed into kafka-server-start.sh. The shell script prints an error if no arguments are passed in, effectively causing the same check in [Kafka|https://github.com/apache/kafka/blob/92db08cba582668d77160b0c2853efd45a1b809b/core/src/main/scala/kafka/Kafka.scala#L43] never to be triggered, unless the only option specified is -daemon, which would be removed before the arguments are passed to the Java class. While not in any way critical, I don't think that this is intended behavior. I think we should remove the extra check in kafka-server-start.sh and leave argument handling up to the Kafka class. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
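The proposed cleanup, validating in exactly one place, can be sketched as follows; both function names are hypothetical stand-ins for the shell wrapper and the Kafka class:

```python
def strip_daemon(argv):
    """Mimic kafka-server-start.sh: consume -daemon before the JVM launch."""
    return [a for a in argv if a != "-daemon"]

def kafka_main(args):
    """Mimic the check in kafka.Kafka: the single place that validates args."""
    if not args:
        raise SystemExit("USAGE: server.properties [--override prop=val]*")
    return args[0]

# With `-daemon` as the only option, a shell-level "is argv empty?" check
# would pass, while the class-level check correctly rejects the empty list.
remaining = strip_daemon(["-daemon"])
print(remaining)  # [] -> kafka_main(remaining) would exit with the usage text
```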
[jira] [Commented] (KAFKA-7937) Flaky Test ResetConsumerGroupOffsetTest.testResetOffsetsNotExistingGroup
[ https://issues.apache.org/jira/browse/KAFKA-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820256#comment-16820256 ] Matthias J. Sax commented on KAFKA-7937: [https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3562/tests] > Flaky Test ResetConsumerGroupOffsetTest.testResetOffsetsNotExistingGroup > > > Key: KAFKA-7937 > URL: https://issues.apache.org/jira/browse/KAFKA-7937 > Project: Kafka > Issue Type: Bug > Components: admin, clients, unit tests >Affects Versions: 2.2.0, 2.1.1, 2.3.0 >Reporter: Matthias J. Sax >Assignee: Gwen Shapira >Priority: Critical > Fix For: 2.3.0, 2.1.2, 2.2.1 > > > To get stable nightly builds for `2.2` release, I create tickets for all > observed test failures. > https://builds.apache.org/blue/organizations/jenkins/kafka-2.2-jdk8/detail/kafka-2.2-jdk8/19/pipeline > {quote}kafka.admin.ResetConsumerGroupOffsetTest > > testResetOffsetsNotExistingGroup FAILED > java.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.CoordinatorNotAvailableException: The > coordinator is not available. at > org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45) > at > org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32) > at > org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89) > at > org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260) > at > kafka.admin.ConsumerGroupCommand$ConsumerGroupService.resetOffsets(ConsumerGroupCommand.scala:306) > at > kafka.admin.ResetConsumerGroupOffsetTest.testResetOffsetsNotExistingGroup(ResetConsumerGroupOffsetTest.scala:89) > Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: > The coordinator is not available.{quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8041) Flaky Test LogDirFailureTest#testIOExceptionDuringLogRoll
[ https://issues.apache.org/jira/browse/KAFKA-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820255#comment-16820255 ] Matthias J. Sax commented on KAFKA-8041: One more: [https://builds.apache.org/blue/organizations/jenkins/kafka-2.0-jdk8/detail/kafka-2.0-jdk8/249/tests] > Flaky Test LogDirFailureTest#testIOExceptionDuringLogRoll > - > > Key: KAFKA-8041 > URL: https://issues.apache.org/jira/browse/KAFKA-8041 > Project: Kafka > Issue Type: Bug > Components: core, unit tests >Affects Versions: 2.0.1, 2.3.0 >Reporter: Matthias J. Sax >Priority: Critical > Labels: flaky-test > Fix For: 2.0.2, 2.3.0 > > > [https://builds.apache.org/blue/organizations/jenkins/kafka-2.0-jdk8/detail/kafka-2.0-jdk8/236/tests] > {quote}java.lang.AssertionError: Expected some messages > at kafka.utils.TestUtils$.fail(TestUtils.scala:357) > at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:787) > at > kafka.server.LogDirFailureTest.testProduceAfterLogDirFailureOnLeader(LogDirFailureTest.scala:189) > at > kafka.server.LogDirFailureTest.testIOExceptionDuringLogRoll(LogDirFailureTest.scala:63){quote} > STDOUT > {quote}[2019-03-05 03:44:58,614] ERROR [ReplicaFetcher replicaId=1, > leaderId=0, fetcherId=0] Error for partition topic-6 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-03-05 03:44:58,614] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=0] Error for partition topic-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. 
> [2019-03-05 03:44:58,615] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=0] Error for partition topic-10 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-03-05 03:44:58,615] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=0] Error for partition topic-4 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-03-05 03:44:58,615] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=0] Error for partition topic-8 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-03-05 03:44:58,615] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=0] Error for partition topic-2 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. 
> [2019-03-05 03:45:00,248] ERROR Error while rolling log segment for topic-0 > in dir > /home/jenkins/jenkins-slave/workspace/kafka-2.0-jdk8/core/data/kafka-3869208920357262216 > (kafka.server.LogDirFailureChannel:76) > java.io.FileNotFoundException: > /home/jenkins/jenkins-slave/workspace/kafka-2.0-jdk8/core/data/kafka-3869208920357262216/topic-0/.index > (Not a directory) > at java.io.RandomAccessFile.open0(Native Method) > at java.io.RandomAccessFile.open(RandomAccessFile.java:316) > at java.io.RandomAccessFile.(RandomAccessFile.java:243) > at kafka.log.AbstractIndex.$anonfun$resize$1(AbstractIndex.scala:121) > at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:12) > at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251) > at kafka.log.AbstractIndex.resize(AbstractIndex.scala:115) > at kafka.log.AbstractIndex.$anonfun$trimToValidSize$1(AbstractIndex.scala:184) > at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:12) > at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251) > at kafka.log.AbstractIndex.trimToValidSize(AbstractIndex.scala:184) > at kafka.log.LogSegment.onBecomeInactiveSegment(LogSegment.scala:501) > at kafka.log.Log.$anonfun$roll$8(Log.scala:1520) > at kafka.log.Log.$anonfun$roll$8$adapted(Log.scala:1520) > at scala.Option.foreach(Option.scala:257) > at kafka.log.Log.$anonfun$roll$2(Log.scala:1520) > at kafka.log.Log.maybeHandleIOException(Log.scala:1881) > at kafka.log.Log.roll(Log.scala:1484) > at > kafka.server.LogDirFailureTest.testProduceAfterLogDirFailureOnLeader(LogDirFailureTest.scala:154) > at > kafka.server.LogDirFailureTest.testIOExceptionDuringLogRoll(LogDirFailureTest.scala:63) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at >
[jira] [Commented] (KAFKA-7965) Flaky Test ConsumerBounceTest#testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup
[ https://issues.apache.org/jira/browse/KAFKA-7965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820248#comment-16820248 ] Matthias J. Sax commented on KAFKA-7965: [https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/20974/testReport/junit/kafka.api/ConsumerBounceTest/testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup/] > Flaky Test > ConsumerBounceTest#testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup > > > Key: KAFKA-7965 > URL: https://issues.apache.org/jira/browse/KAFKA-7965 > Project: Kafka > Issue Type: Bug > Components: clients, consumer, unit tests >Affects Versions: 1.1.1, 2.2.0, 2.3.0 >Reporter: Matthias J. Sax >Assignee: huxihx >Priority: Critical > Labels: flaky-test > Fix For: 2.3.0, 2.2.1 > > > To get stable nightly builds for `2.2` release, I create tickets for all > observed test failures. > [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/21/] > {quote}java.lang.AssertionError: Received 0, expected at least 68 at > org.junit.Assert.fail(Assert.java:88) at > org.junit.Assert.assertTrue(Assert.java:41) at > kafka.api.ConsumerBounceTest.receiveAndCommit(ConsumerBounceTest.scala:557) > at > kafka.api.ConsumerBounceTest.$anonfun$testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup$1(ConsumerBounceTest.scala:320) > at > kafka.api.ConsumerBounceTest.$anonfun$testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup$1$adapted(ConsumerBounceTest.scala:319) > at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) > at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at > kafka.api.ConsumerBounceTest.testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup(ConsumerBounceTest.scala:319){quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-6455) Improve timestamp propagation at DSL level
[ https://issues.apache.org/jira/browse/KAFKA-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820247#comment-16820247 ] ASF GitHub Bot commented on KAFKA-6455: --- mjsax commented on pull request #6565: KAFKA-6455: KStream-KStream join should set max timestamp for result record URL: https://github.com/apache/kafka/pull/6565 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Improve timestamp propagation at DSL level > -- > > Key: KAFKA-6455 > URL: https://issues.apache.org/jira/browse/KAFKA-6455 > Project: Kafka > Issue Type: Improvement > Components: streams >Affects Versions: 1.0.0 >Reporter: Matthias J. Sax >Assignee: Matthias J. Sax >Priority: Major > Labels: needs-kip > > At DSL level, we inherit the timestamp propagation "contract" from the > Processor API. This contract in not optimal at DSL level, and we should > define a DSL level contract that matches the semantics of the corresponding > DSL operator. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
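The linked PR gives KStream-KStream join results the larger of the two input timestamps. A toy sketch of an inner windowed join with that contract (not the Streams implementation; record shapes are simplified to tuples):

```python
def windowed_join(left, right, window):
    """Toy inner KStream-KStream join on key, with the proposed timestamp
    contract: each result carries max(left_ts, right_ts)."""
    results = []
    for lk, lv, lts in left:
        for rk, rv, rts in right:
            if lk == rk and abs(lts - rts) <= window:
                results.append((lk, (lv, rv), max(lts, rts)))
    return results

left = [("k1", "L", 10)]
right = [("k1", "R", 42), ("k1", "late", 200)]
joined = windowed_join(left, right, window=60)
print(joined)
# one result within the 60-unit window, stamped with the max input timestamp
```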
[jira] [Assigned] (KAFKA-8187) State store record loss across multiple reassignments when using standby tasks
[ https://issues.apache.org/jira/browse/KAFKA-8187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax reassigned KAFKA-8187: -- Assignee: Matthias J. Sax (was: Bill Bejeck) > State store record loss across multiple reassignments when using standby tasks > -- > > Key: KAFKA-8187 > URL: https://issues.apache.org/jira/browse/KAFKA-8187 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.0.1 >Reporter: William Greer >Assignee: Matthias J. Sax >Priority: Major > > Overview: > There is a race condition that can cause a partitioned state store to be > missing records up to an offset when using standby tasks. > When a reassignment occurs and a task is migrated to a StandbyTask in another > StreamThread/TaskManager on the same JVM, there can be lock contention that > prevents the StandbyTask on the currently assigned StreamThread from > acquiring the lock; the acquisition is never retried, because all of the > active StreamTasks are running for that StreamThread. If the StandbyTask does > not acquire the lock before the StreamThread enters the RUNNING state, > then the StandbyTask will not consume any records. If there is no subsequent > reassignment before the second execution of the stateDirCleaner thread, then > the task directory for the StandbyTask will be deleted. When the next > reassignment occurs, the offset that was read by the StandbyTask at creation > time, before acquiring the lock, will be written back to the state store > directory; this re-creates the state store directory. > An example: > StreamThread(A) and StreamThread(B) are running on the same JVM in the same > streams application. > StreamThread(A) has StandbyTask 1_0 > StreamThread(B) has no tasks > A reassignment is triggered by another host in the streams application fleet. 
> StreamThread(A) is notified with a PARTITIONS_REVOKED event for the thread's > one task > StreamThread(B) is notified with a PARTITIONS_ASSIGNED event of a standby > task for 1_0 > Here begins the race condition. > StreamThread(B) creates the StandbyTask, which reads the current checkpoint > from disk. > StreamThread(B) then attempts to updateNewAndRestoringTasks() for its > assigned tasks. [0] > StreamThread(B) initializes the new tasks for the active and standby tasks. > [1] [2] > StreamThread(B) attempts to lock the state directory for task 1_0 but fails > with a LockException [3], since StreamThread(A) still holds the lock. > StreamThread(B) returns true from updateNewAndRestoringTasks() due to the > check at [4], which only checks that the active assigned tasks are running. > StreamThread(B) state is set to RUNNING > StreamThread(A) closes the previous StandbyTask, specifically calling > closeStateManager() [5] > StreamThread(A) state is set to RUNNING > The Streams application for this host has completed re-balancing and is now in > the RUNNING state. > State at this point is the following: the state directory exists for 1_0 and all > data is present. > Then, 1 to 2 intervals of [6] (which defaults to 10 > minutes) after the reassignment has completed, the stateDirCleaner thread will > execute [7]. > The stateDirCleaner will then do [8], which finds the directory 1_0, finds > that there isn't an active lock for that directory, acquires the lock, and > deletes the directory. > State at this point is the following: the state directory does not exist for 1_0. > When the next reassignment occurs, the offset that was read by > StreamThread(B) during construction of the StandbyTask for 1_0 will be > written back to disk. This write re-creates the state store directory and > writes the .checkpoint file with the old offset. 
> State at this point is the following: State directory exists for 1_0 with a > '.checkpoint' file in it, but there is no other state store data in the > directory. > If this host is assigned the active task for 1_0 then all the history in the > state store will be missing from before the offset that was read at the > previous reassignment. > If this host is assigned the standby task for 1_0 then the lock will be > acquired and the standby will start to consume records, but it will still be > missing all records from before the offset that was read at the previous > reassignment. > If this host is not assigned 1_0, then the state directory will get cleaned > up by the stateDirCleaner thread 10 to 20 minutes later and the record loss > issue will be hidden. > [0] > https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamThread.java#L865-L869 > [1] >
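The core of the race described above is a try-once lock acquisition combined with a cleaner that removes any unlocked directory. A toy sketch with assumed names (`threading.Lock` stands in for the state directory lock, a list for the store's data):

```python
import threading

class StateDir:
    """Toy state directory: a lock plus the store's data."""
    def __init__(self):
        self.lock = threading.Lock()
        self.data = ["record-0", "record-1"]

def standby_init(state_dir):
    """Mimic the try-once acquisition: the LockException-style failure is
    swallowed and never retried, as in steps [3]/[4] above."""
    return state_dir.lock.acquire(blocking=False)

def state_dir_cleaner(state_dir):
    """Mimic the cleaner: deletes any directory whose lock is free."""
    if state_dir.lock.acquire(blocking=False):
        state_dir.data.clear()  # directory removed
        state_dir.lock.release()

d = StateDir()
d.lock.acquire()            # StreamThread(A) still holds the lock
acquired = standby_init(d)  # StreamThread(B)'s standby: try-once fails
d.lock.release()            # A closes its StandbyTask afterwards

state_dir_cleaner(d)        # no holder left -> the data is deleted
print(acquired, d.data)     # the standby never re-tried, so nothing stops this
```

A retry of the acquisition after the active tasks are running, or a cleaner that skips recently assigned tasks, would both break the race; the sketch only shows why the current sequence loses the data.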
[jira] [Commented] (KAFKA-6951) Implement offset expiration semantics for unsubscribed topics
[ https://issues.apache.org/jira/browse/KAFKA-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820132#comment-16820132 ] Emmanuel Brard commented on KAFKA-6951: --- [~wushujames] it's not implemented yet: {code:java} root@kafka-1:/# kafka-consumer-groups --bootstrap-server localhost:9092 --describe --group my_favorite_group TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID bar 0 58 58 0 kafka-python-1.4.6-5f66aa52-8497-404c-9c3e-69f36139b30d /172.19.0.5 kafka-python-1.4.6 foo 0 56 58 2 - - - {code} {code:java} root@kafka-1:/# kafka-consumer-groups --bootstrap-server localhost:9091 --group my_favorite_group --topic foo --delete The consumer does not support topic-specific offset deletion from a consumer group. {code} See https://github.com/apache/kafka/blob/47a9871ef65a13f9d58d5ea216de340f7e123da5/core/src/main/scala/kafka/admin/ConsumerGroupCommand.scala#L941 > Implement offset expiration semantics for unsubscribed topics > - > > Key: KAFKA-6951 > URL: https://issues.apache.org/jira/browse/KAFKA-6951 > Project: Kafka > Issue Type: Improvement > Components: core >Reporter: Vahid Hashemian >Assignee: Vahid Hashemian >Priority: Major > > [This > portion|https://cwiki.apache.org/confluence/display/KAFKA/KIP-211%3A+Revise+Expiration+Semantics+of+Consumer+Group+Offsets#KIP-211:ReviseExpirationSemanticsofConsumerGroupOffsets-UnsubscribingfromaTopic] > of KIP-211 will be implemented separately from the main PR. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (KAFKA-5609) Connect log4j should log to file by default
[ https://issues.apache.org/jira/browse/KAFKA-5609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaufman Ng reassigned KAFKA-5609: - Assignee: Kaufman Ng > Connect log4j should log to file by default > --- > > Key: KAFKA-5609 > URL: https://issues.apache.org/jira/browse/KAFKA-5609 > Project: Kafka > Issue Type: Improvement > Components: config, KafkaConnect >Affects Versions: 0.11.0.0 >Reporter: Yeva Byzek >Assignee: Kaufman Ng >Priority: Minor > Labels: easyfix > > {{https://github.com/apache/kafka/blob/trunk/config/connect-log4j.properties}} > Currently logs to stdout. It should also log to a file by default, otherwise > it just writes to console and messages can be lost -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16819913#comment-16819913 ] Rabih Hallage commented on KAFKA-5998: -- Hello Guys, I'm experiencing the same issue with the following stack trace (I tried to change the state directory, same thing): {code:java} [2019-04-16 16:42:13,459] WARN [-app-cid-1-StreamThread-1] task [0_0] Failed to write offset checkpoint file to [/opt/ial/kafka-streams/-app_v1/0_0/.checkpoint] (org.apache.kafka.streams.processor.internals.ProcessorStateManager) java.io.FileNotFoundException: /opt/ial/kafka-streams/-app_v1/0_0/.checkpoint.tmp (No such file or directory) at java.io.FileOutputStream.open0(Native Method) at java.io.FileOutputStream.open(FileOutputStream.java:270) at java.io.FileOutputStream.(FileOutputStream.java:213) at java.io.FileOutputStream.(FileOutputStream.java:162) at org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:79) at org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:325) at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:474) at org.apache.kafka.streams.processor.internals.StreamTask.suspend(StreamTask.java:596) at org.apache.kafka.streams.processor.internals.StreamTask.suspend(StreamTask.java:567) at org.apache.kafka.streams.processor.internals.AssignedTasks.suspendTasks(AssignedTasks.java:128) at org.apache.kafka.streams.processor.internals.AssignedTasks.suspend(AssignedTasks.java:97) at org.apache.kafka.streams.processor.internals.TaskManager.suspendTasksAndState(TaskManager.java:242) at org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener.onPartitionsRevoked(StreamThread.java:319) at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare(ConsumerCoordinator.java:461) at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:396) at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java) {code} > /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1 >Reporter: Yogesh BG >Priority: Critical > Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, > props.txt, streams.txt > > > I have one kafka broker and one kafka stream running... I am running its > since two days under load of around 2500 msgs per second.. On third day am > getting below exception for some of the partitions, I have 16 partitions only > 0_0 and 0_1 gives this error > {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201) > 
[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415) >
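The checkpoint writer creates a `.checkpoint.tmp` file and renames it into place, so the FileNotFoundException appears when the task directory has been removed before the tmp file is opened. A sketch of that write-then-rename pattern and its failure mode (file layout simplified; not the OffsetCheckpoint implementation):

```python
import os
import shutil
import tempfile

def write_checkpoint(task_dir, offsets):
    """Write offsets atomically: .checkpoint.tmp first, then rename into
    place. Raises FileNotFoundError when task_dir has been removed
    underneath us, which is the symptom reported in this ticket."""
    tmp = os.path.join(task_dir, ".checkpoint.tmp")
    final = os.path.join(task_dir, ".checkpoint")
    with open(tmp, "w") as f:  # this open() fails if task_dir is gone
        for tp, offset in offsets.items():
            f.write(f"{tp} {offset}\n")
    os.replace(tmp, final)     # atomic rename on POSIX

task_dir = tempfile.mkdtemp()
write_checkpoint(task_dir, {"topic-0": 42})
with open(os.path.join(task_dir, ".checkpoint")) as f:
    content = f.read()

shutil.rmtree(task_dir)  # task directory deleted (e.g. by cleanup)
failed = False
try:
    write_checkpoint(task_dir, {"topic-0": 43})
except FileNotFoundError:
    failed = True
print(content.strip(), failed)
```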
[jira] [Commented] (KAFKA-8246) refactor topic/group instance id validation condition
[ https://issues.apache.org/jira/browse/KAFKA-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16819899#comment-16819899 ] Sönke Liebau commented on KAFKA-8246: - I believe that used to be a regular expression but was changed for better performance, see this [pull request|https://github.com/apache/kafka/pull/3234]. > refactor topic/group instance id validation condition > - > > Key: KAFKA-8246 > URL: https://issues.apache.org/jira/browse/KAFKA-8246 > Project: Kafka > Issue Type: Improvement >Reporter: Boyang Chen >Assignee: Boyang Chen >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
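Per the linked PR, the validation moved from a regex to a hand-rolled character loop for performance. A simplified sketch of both forms of the check, agreeing on the same inputs (not the exact Kafka implementation):

```python
import re

VALID_RE = re.compile(r"^[a-zA-Z0-9._-]+$")

def is_valid_regex(name):
    """Regex form of the name character check."""
    return VALID_RE.match(name) is not None

def is_valid_loop(name):
    """Loop form (the style the PR switched to for performance): every
    character must be ASCII alphanumeric, '.', '_' or '-'."""
    if not name:
        return False
    for c in name:
        if not ((c.isalnum() and c.isascii()) or c in "._-"):
            return False
    return True

samples = ["catalog.offer-updated.1", "my_topic", "bad topic!", "", "grp:1"]
for s in samples:
    print(s, is_valid_regex(s), is_valid_loop(s))
```

The two forms are behaviorally equivalent on these inputs; the loop avoids the regex engine on a check that runs for every validated name.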
[jira] [Commented] (KAFKA-8110) Flaky Test DescribeConsumerGroupTest#testDescribeMembersWithConsumersWithoutAssignedPartitions
[ https://issues.apache.org/jira/browse/KAFKA-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16819784#comment-16819784 ] Matthias J. Sax commented on KAFKA-8110: Failed again: [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/3807/testReport/junit/kafka.admin/DescribeConsumerGroupTest/testDescribeOffsetsWithConsumersWithoutAssignedPartitions/] > Flaky Test > DescribeConsumerGroupTest#testDescribeMembersWithConsumersWithoutAssignedPartitions > -- > > Key: KAFKA-8110 > URL: https://issues.apache.org/jira/browse/KAFKA-8110 > Project: Kafka > Issue Type: Bug > Components: core, unit tests >Affects Versions: 2.2.0 >Reporter: Matthias J. Sax >Priority: Critical > Labels: flaky-test > Fix For: 2.3.0, 2.2.1 > > > [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/67/testReport/junit/kafka.admin/DescribeConsumerGroupTest/testDescribeMembersWithConsumersWithoutAssignedPartitions/] > {quote}java.lang.AssertionError: Partition [__consumer_offsets,0] metadata > not propagated after 15000 ms at > kafka.utils.TestUtils$.fail(TestUtils.scala:381) at > kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:791) at > kafka.utils.TestUtils$.waitUntilMetadataIsPropagated(TestUtils.scala:880) at > kafka.utils.TestUtils$.$anonfun$createTopic$3(TestUtils.scala:318) at > kafka.utils.TestUtils$.$anonfun$createTopic$3$adapted(TestUtils.scala:317) at > scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237) at > scala.collection.immutable.Range.foreach(Range.scala:158) at > scala.collection.TraversableLike.map(TraversableLike.scala:237) at > scala.collection.TraversableLike.map$(TraversableLike.scala:230) at > scala.collection.AbstractTraversable.map(Traversable.scala:108) at > kafka.utils.TestUtils$.createTopic(TestUtils.scala:317) at > kafka.utils.TestUtils$.createOffsetsTopic(TestUtils.scala:375) at > 
kafka.admin.DescribeConsumerGroupTest.testDescribeMembersWithConsumersWithoutAssignedPartitions(DescribeConsumerGroupTest.scala:372){quote} > STDOUT > {quote}[2019-03-14 20:01:52,347] WARN Ignoring unexpected runtime exception > (org.apache.zookeeper.server.NIOServerCnxnFactory:236) > java.nio.channels.CancelledKeyException at > sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73) at > sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87) at > org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:205) > at java.lang.Thread.run(Thread.java:748) TOPIC PARTITION CURRENT-OFFSET > LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID foo 0 0 0 0 - - - TOPIC > PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID foo 0 > 0 0 0 - - - COORDINATOR (ID) ASSIGNMENT-STRATEGY STATE #MEMBERS > localhost:44669 (0){quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7965) Flaky Test ConsumerBounceTest#testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup
[ https://issues.apache.org/jira/browse/KAFKA-7965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16819781#comment-16819781 ] Matthias J. Sax commented on KAFKA-7965: [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/3807/testReport/junit/kafka.api/ConsumerBounceTest/testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup/] > Flaky Test > ConsumerBounceTest#testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup > > > Key: KAFKA-7965 > URL: https://issues.apache.org/jira/browse/KAFKA-7965 > Project: Kafka > Issue Type: Bug > Components: clients, consumer, unit tests >Affects Versions: 1.1.1, 2.2.0, 2.3.0 >Reporter: Matthias J. Sax >Assignee: huxihx >Priority: Critical > Labels: flaky-test > Fix For: 2.3.0, 2.2.1 > > > To get stable nightly builds for `2.2` release, I create tickets for all > observed test failures. > [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/21/] > {quote}java.lang.AssertionError: Received 0, expected at least 68 at > org.junit.Assert.fail(Assert.java:88) at > org.junit.Assert.assertTrue(Assert.java:41) at > kafka.api.ConsumerBounceTest.receiveAndCommit(ConsumerBounceTest.scala:557) > at > kafka.api.ConsumerBounceTest.$anonfun$testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup$1(ConsumerBounceTest.scala:320) > at > kafka.api.ConsumerBounceTest.$anonfun$testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup$1$adapted(ConsumerBounceTest.scala:319) > at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) > at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at > kafka.api.ConsumerBounceTest.testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup(ConsumerBounceTest.scala:319){quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8032) Flaky Test UserQuotaTest#testQuotaOverrideDelete
[ https://issues.apache.org/jira/browse/KAFKA-8032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16819767#comment-16819767 ] Manikumar commented on KAFKA-8032: -- [~mjsax] Sorry for the delay. Will take a look next week. > Flaky Test UserQuotaTest#testQuotaOverrideDelete > > > Key: KAFKA-8032 > URL: https://issues.apache.org/jira/browse/KAFKA-8032 > Project: Kafka > Issue Type: Bug > Components: core, unit tests >Affects Versions: 2.3.0 >Reporter: Matthias J. Sax >Priority: Critical > Labels: flaky-test > Fix For: 2.3.0 > > > [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/2830/testReport/junit/kafka.api/UserQuotaTest/testQuotaOverrideDelete/] > {quote}java.lang.AssertionError: Client with id=QuotasTestProducer-1 should > have been throttled at org.junit.Assert.fail(Assert.java:89) at > org.junit.Assert.assertTrue(Assert.java:42) at > kafka.api.QuotaTestClients.verifyThrottleTimeMetric(BaseQuotaTest.scala:229) > at kafka.api.QuotaTestClients.verifyProduceThrottle(BaseQuotaTest.scala:215) > at > kafka.api.BaseQuotaTest.testQuotaOverrideDelete(BaseQuotaTest.scala:124){quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)