[jira] [Created] (KAFKA-7575) 'Error while writing to checkpoint file' Issue
Dasun Nirmitha created KAFKA-7575: - Summary: 'Error while writing to checkpoint file' Issue Key: KAFKA-7575 URL: https://issues.apache.org/jira/browse/KAFKA-7575 Project: Kafka Issue Type: Bug Components: producer Affects Versions: 1.1.1 Environment: Windows 10, Kafka 1.1.1 Reporter: Dasun Nirmitha Attachments: Dry run error.rar I'm currently testing a Java Kafka producer application coded to retrieve a db value from a local MySQL db and produce to a single topic. Locally I've got a Zookeeper server and a single Kafka broker running. My issue is I need to produce this from the Kafka producer each second, and that works for around 2 hours until the broker throws an 'Error while writing to checkpoint file' and shuts down. Producing with a 1-minute interval works with no issues, but unfortunately I need the produce interval to be 1 second. I have attached a rar containing screenshots of the errors thrown by the broker and my application. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7574) Kafka unable to delete segment file in Linux
[ https://issues.apache.org/jira/browse/KAFKA-7574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Maas updated KAFKA-7574: Description: The initial error is that Kafka believes it is unable to delete a log segment file. This causes Kafka to mark the log directory as unavailable and eventually shut down without flushing data to disk (which is probably the right thing to do). There are no indications in the OS logs of a failed filesystem or any other OS-level issue. We have verified that the filesystem is consistent. Is there an I/O timeout of some kind that can be adjusted, or is something else happening? Potential duplicate of the race condition seen in KAFKA-6194. See attached files for config and example log pattern. was: The initial error is that Kafka believes it is unable to delete a log segment file. This causes Kafka to mark the log directory as unavailable and eventually shut down without flushing data to disk. There are no indications in the OS logs of a failed filesystem or any other OS-level issue. We have verified that the filesystem is consistent. Is there an I/O timeout of some kind that can be adjusted, or is something else happening? Potential duplicate of the race condition seen in KAFKA-6194. See attached files for config and example log pattern. > Kafka unable to delete segment file in Linux > > > Key: KAFKA-7574 > URL: https://issues.apache.org/jira/browse/KAFKA-7574 > Project: Kafka > Issue Type: Bug > Components: core, log >Affects Versions: 1.0.0 > Environment: Debian Stretch >Reporter: Ben Maas >Priority: Major > Attachments: kafka-config.log, kafka-delete_error_pattern.log > > > The initial error is that Kafka believes it is unable to delete a log segment > file. This causes Kafka to mark the log directory as unavailable and > eventually shut down without flushing data to disk (which is probably the > right thing to do). There are no indications in the OS logs of a failed > filesystem or any other OS-level issue. We have verified that the filesystem is > consistent. > Is there an I/O timeout of some kind that can be adjusted, or is something else > happening? Potential duplicate of the race condition seen in KAFKA-6194. > See attached files for config and example log pattern. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-7574) Kafka unable to delete segment file in Linux
Ben Maas created KAFKA-7574: --- Summary: Kafka unable to delete segment file in Linux Key: KAFKA-7574 URL: https://issues.apache.org/jira/browse/KAFKA-7574 Project: Kafka Issue Type: Bug Components: core, log Affects Versions: 1.0.0 Environment: Debian Stretch Reporter: Ben Maas Attachments: kafka-config.log, kafka-delete_error_pattern.log The initial error is that Kafka believes it is unable to delete a log segment file. This causes Kafka to mark the log directory as unavailable and eventually shut down without flushing data to disk. There are no indications in the OS logs of a failed filesystem or any other OS-level issue. We have verified that the filesystem is consistent. Is there an I/O timeout of some kind that can be adjusted, or is something else happening? Potential duplicate of the race condition seen in KAFKA-6194. See attached files for config and example log pattern. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7481) Consider options for safer upgrade of offset commit value schema
[ https://issues.apache.org/jira/browse/KAFKA-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669391#comment-16669391 ] Dong Lin commented on KAFKA-7481: - [~ijuma] Yeah, this works, and this is why I think we need two separate upgrade states. Currently there are three possible upgrade states: binary version upgraded; binary version + inter.broker.protocol.version upgraded; and binary version + inter.broker.protocol.version + message.format.version upgraded. I guess my point is that it is reasonable to keep these three states in the long term. > Consider options for safer upgrade of offset commit value schema > > > Key: KAFKA-7481 > URL: https://issues.apache.org/jira/browse/KAFKA-7481 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Blocker > Fix For: 2.1.0 > > > KIP-211 and KIP-320 add new versions of the offset commit value schema. The > use of the new schema version is controlled by the > `inter.broker.protocol.version` configuration. Once the new inter-broker > version is in use, it is not possible to downgrade since the older brokers > will not be able to parse the new schema. > The options at the moment are the following: > 1. Do nothing. Users can try the new version and keep > `inter.broker.protocol.version` locked to the old release. Downgrade will > still be possible, but users will not be able to test new capabilities which > depend on inter-broker protocol changes. > 2. Instead of using `inter.broker.protocol.version`, we could use > `message.format.version`. This would basically extend the use of this config > to apply to all persistent formats. The advantage is that it allows users to > upgrade the broker and begin using the new inter-broker protocol while still > allowing downgrade. But features which depend on the persistent format could > not be tested. > Any other options? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
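For concreteness, the three states correspond roughly to the following broker configuration stages (an illustrative sketch only; the broker-level key for the message format is log.message.format.version, and the version values shown are placeholders):
{code}
# server.properties -- staged upgrade, illustrative version values

# State 1: binaries upgraded, both configs pinned to the old version.
# Rollback to the old binaries is still possible.
inter.broker.protocol.version=1.1
log.message.format.version=1.1

# State 2: inter-broker protocol bumped. New persistent schemas (e.g. the
# offset commit value) may now be written, so downgrade is no longer safe.
inter.broker.protocol.version=2.0
log.message.format.version=1.1

# State 3: message format bumped once clients have been upgraded, to avoid
# the broker paying down-conversion costs for old consumers.
inter.broker.protocol.version=2.0
log.message.format.version=2.0
{code}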
[jira] [Commented] (KAFKA-7481) Consider options for safer upgrade of offset commit value schema
[ https://issues.apache.org/jira/browse/KAFKA-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669360#comment-16669360 ] Ismael Juma commented on KAFKA-7481: The state where you can roll back is "upgrade binaries" and don't bump either of the configs. :) > Consider options for safer upgrade of offset commit value schema > > > Key: KAFKA-7481 > URL: https://issues.apache.org/jira/browse/KAFKA-7481 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Blocker > Fix For: 2.1.0 > > > KIP-211 and KIP-320 add new versions of the offset commit value schema. The > use of the new schema version is controlled by the > `inter.broker.protocol.version` configuration. Once the new inter-broker > version is in use, it is not possible to downgrade since the older brokers > will not be able to parse the new schema. > The options at the moment are the following: > 1. Do nothing. Users can try the new version and keep > `inter.broker.protocol.version` locked to the old release. Downgrade will > still be possible, but users will not be able to test new capabilities which > depend on inter-broker protocol changes. > 2. Instead of using `inter.broker.protocol.version`, we could use > `message.format.version`. This would basically extend the use of this config > to apply to all persistent formats. The advantage is that it allows users to > upgrade the broker and begin using the new inter-broker protocol while still > allowing downgrade. But features which depend on the persistent format could > not be tested. > Any other options? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7481) Consider options for safer upgrade of offset commit value schema
[ https://issues.apache.org/jira/browse/KAFKA-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669323#comment-16669323 ] Dong Lin commented on KAFKA-7481: - [~ijuma] Regarding `I am still not sure if there's a lot of value in having two separate upgrade states`, I think we need at least one separate upgrade state for changes that cannot be rolled back, since it seems weird not to be able to downgrade when there is only a minor version change in Kafka. And the rationale for the second separate upgrade state is that there are two categories of changes that prevent downgrade, e.g. those that change the topic schema and those that change the message format. It is common for users to be willing to pick up the first category of change very soon, and only pick up the second category much later, after the client library has been upgraded, to reduce the performance cost in the broker. > Consider options for safer upgrade of offset commit value schema > > > Key: KAFKA-7481 > URL: https://issues.apache.org/jira/browse/KAFKA-7481 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Blocker > Fix For: 2.1.0 > > > KIP-211 and KIP-320 add new versions of the offset commit value schema. The > use of the new schema version is controlled by the > `inter.broker.protocol.version` configuration. Once the new inter-broker > version is in use, it is not possible to downgrade since the older brokers > will not be able to parse the new schema. > The options at the moment are the following: > 1. Do nothing. Users can try the new version and keep > `inter.broker.protocol.version` locked to the old release. Downgrade will > still be possible, but users will not be able to test new capabilities which > depend on inter-broker protocol changes. > 2. Instead of using `inter.broker.protocol.version`, we could use > `message.format.version`. This would basically extend the use of this config > to apply to all persistent formats. The advantage is that it allows users to > upgrade the broker and begin using the new inter-broker protocol while still > allowing downgrade. But features which depend on the persistent format could > not be tested. > Any other options? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7553) Jenkins PR tests hung
[ https://issues.apache.org/jira/browse/KAFKA-7553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669285#comment-16669285 ] Jason Gustafson commented on KAFKA-7553: I investigated a few additional hanging builds and also observed `testLowMaxFetchSizeForRequestAndPartition` hanging. Curiously it only seems to happen when the PR builder runs (AFAICT). > Jenkins PR tests hung > - > > Key: KAFKA-7553 > URL: https://issues.apache.org/jira/browse/KAFKA-7553 > Project: Kafka > Issue Type: Bug >Reporter: John Roesler >Priority: Minor > Attachments: consoleText.txt > > > I wouldn't worry about this unless it continues to happen, but I wanted to > document it. > This was a Java 11 build: > [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/266/] > It was for this PR: [https://github.com/apache/kafka/pull/5795] > And this commit: > [https://github.com/apache/kafka/pull/5795/commits/5bdcd0e023c6f406d585155399f6541bb6a9f9c2] > > It looks like the tests just hung after 46 minutes, until the build timed out > at 180 minutes. > End of the output: > {noformat} > ... > 00:46:27.275 kafka.server.ServerGenerateBrokerIdTest > > testConsistentBrokerIdFromUserConfigAndMetaProps STARTED > 00:46:29.775 > 00:46:29.775 kafka.server.ServerGenerateBrokerIdTest > > testConsistentBrokerIdFromUserConfigAndMetaProps PASSED > 03:00:51.124 Build timed out (after 180 minutes). Marking the build as > aborted. > 03:00:51.440 Build was aborted > 03:00:51.492 [FINDBUGS] Skipping publisher since build result is ABORTED > 03:00:51.492 Recording test results > 03:00:51.495 Setting GRADLE_4_8_1_HOME=/home/jenkins/tools/gradle/4.8.1 > 03:00:58.017 Setting GRADLE_4_8_1_HOME=/home/jenkins/tools/gradle/4.8.1 > 03:00:59.330 Setting GRADLE_4_8_1_HOME=/home/jenkins/tools/gradle/4.8.1 > 03:00:59.331 Adding one-line test results to commit status... > 03:00:59.332 Setting GRADLE_4_8_1_HOME=/home/jenkins/tools/gradle/4.8.1 > 03:00:59.334 Setting GRADLE_4_8_1_HOME=/home/jenkins/tools/gradle/4.8.1 > 03:00:59.335 Setting status of 5bdcd0e023c6f406d585155399f6541bb6a9f9c2 to > FAILURE with url https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/266/ > and message: 'FAILURE > 03:00:59.335 9053 tests run, 1 skipped, 0 failed.' > 03:00:59.335 Using context: JDK 11 and Scala 2.12 > 03:00:59.541 Setting GRADLE_4_8_1_HOME=/home/jenkins/tools/gradle/4.8.1 > 03:00:59.542 Finished: ABORTED{noformat} > > I did find one test that started but did not finish: > {noformat} > 00:23:29.576 kafka.api.PlaintextConsumerTest > > testLowMaxFetchSizeForRequestAndPartition STARTED > {noformat} > But note that the tests continued to run for another 23 minutes after this > one started. > > Just for completeness, there were 4 failures: > {noformat} > 00:22:06.875 kafka.admin.ResetConsumerGroupOffsetTest > > testResetOffsetsNotExistingGroup FAILED > 00:22:06.875 java.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.CoordinatorNotAvailableException: The > coordinator is not available. 
> 00:22:06.875 at > org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45) > 00:22:06.875 at > org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32) > 00:22:06.875 at > org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89) > 00:22:06.876 at > org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:262) > 00:22:06.876 at > kafka.admin.ConsumerGroupCommand$ConsumerGroupService.resetOffsets(ConsumerGroupCommand.scala:307) > 00:22:06.876 at > kafka.admin.ResetConsumerGroupOffsetTest.testResetOffsetsNotExistingGroup(ResetConsumerGroupOffsetTest.scala:89) > 00:22:06.876 > 00:22:06.876 Caused by: > 00:22:06.876 > org.apache.kafka.common.errors.CoordinatorNotAvailableException: The > coordinator is not available.{noformat} > > {noformat} > 00:25:22.175 kafka.api.CustomQuotaCallbackTest > testCustomQuotaCallback > FAILED > 00:25:22.175 java.lang.AssertionError: Partition [group1_largeTopic,69] > metadata not propagated after 15000 ms > 00:25:22.176 at kafka.utils.TestUtils$.fail(TestUtils.scala:351) > 00:25:22.176 at > kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:741) > 00:25:22.176 at > kafka.utils.TestUtils$.waitUntilMetadataIsPropagated(TestUtils.scala:831) > 00:25:22.176 at > kafka.utils.TestUtils$$anonfun$createTopic$2.apply(TestUtils.scala:330) > 00:25:22.176 at > kafka.utils.TestUtils$$anonfun$createTopic$2.apply(TestUtils.scala:329) > 00:25:22.176 at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > 00:25:22.176 at >
[jira] [Resolved] (KAFKA-7567) Clean up internal metadata usage for consistency and extensibility
[ https://issues.apache.org/jira/browse/KAFKA-7567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-7567. Resolution: Fixed Fix Version/s: 2.2.0 > Clean up internal metadata usage for consistency and extensibility > -- > > Key: KAFKA-7567 > URL: https://issues.apache.org/jira/browse/KAFKA-7567 > Project: Kafka > Issue Type: Improvement >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Major > Fix For: 2.2.0 > > > This refactor has two objectives to improve metadata handling logic and > testing: > 1. We want to reduce dependence on the public object `Cluster` for internal > metadata propagation since it is not easy to evolve. As an example, we need > to propagate leader epochs from the metadata response to `Metadata`, but it > is not straightforward to do this without exposing it in `PartitionInfo` > since that is what `Cluster` uses internally. By doing this change, we are > able to remove some redundant `Cluster` building logic. > 2. We want to make the metadata handling in `MockClient` simpler and more > consistent. Currently we have mix of metadata update mechanisms which are > internally inconsistent with each other and also do not match the > implementation in `NetworkClient`. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-7573) Add an interface that allows broker to intercept every request/response pair
Lincong Li created KAFKA-7573: - Summary: Add an interface that allows broker to intercept every request/response pair Key: KAFKA-7573 URL: https://issues.apache.org/jira/browse/KAFKA-7573 Project: Kafka Issue Type: Improvement Components: core Affects Versions: 2.0.0 Reporter: Lincong Li Assignee: Lincong Li This interface is called "observer" and it opens up several opportunities. One major opportunity is that it enables an auditing system to be built for a Kafka deployment. Details are discussed in a KIP. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
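For illustration only, a sketch of what such an observer could look like (names and signatures here are placeholders, not the interface proposed in the KIP):
{code}
// Hypothetical sketch of a broker-side observer; not the KIP's actual API.
public interface Observer {
    // Invoked for every request/response pair the broker handles, e.g. to
    // feed an external auditing pipeline.
    void observe(RequestHeader header, AbstractRequest request, AbstractResponse response);

    // Lifecycle hook so implementations can flush and close audit sinks.
    void close();
}
{code}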
[jira] [Commented] (KAFKA-7572) Producer should not send requests with negative partition id
[ https://issues.apache.org/jira/browse/KAFKA-7572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669268#comment-16669268 ] ASF GitHub Bot commented on KAFKA-7572: --- yaodong66 opened a new pull request #5858: KAFKA-7572: Producer should not send requests with negative partition id URL: https://github.com/apache/kafka/pull/5858 Partition id should never be a negative value. This commit will make debugging easier when a custom Partitioner generates an invalid negative partition id. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Producer should not send requests with negative partition id > > > Key: KAFKA-7572 > URL: https://issues.apache.org/jira/browse/KAFKA-7572 > Project: Kafka > Issue Type: Bug > Components: clients >Affects Versions: 1.0.1 >Reporter: Yaodong Yang >Priority: Major > > h3. Issue: > In one Kafka producer log from our users, we found the following weird entry: > timestamp="2018-10-09T17:37:41,237-0700",level="ERROR", Message="Write to > Kafka failed with: ",exception="java.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for > topicName--2: 30042 ms has passed since batch creation plus linger time > at > org.apache.kafka.clients.producer.internals.FutureRecordMetadata.valueOrError(FutureRecordMetadata.java:94) > at > org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:64) > at > org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:29) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.kafka.common.errors.TimeoutException: Expiring 1 > record(s) for topicName--2: 30042 ms has passed since batch creation plus > linger time" > After a few hours of debugging, we finally understood the root cause of this > issue: > # The producer used a buggy custom Partitioner, which sometimes generates > negative partition ids for new records. > # The corresponding produce requests were rejected by brokers, because it's > illegal to have a partition with a negative id. > # The client kept refreshing its local cluster metadata, but could not send > produce requests successfully. > # From the above log, we found a suspicious string "topicName--2": > # According to the source code, the format of this string in the log is > TopicName+"-"+PartitionId. > # It's not easy to notice that there were two consecutive dashes in the above > log. > # Eventually, we found that the second dash was a negative sign. Therefore, > the partition id is -2, rather than 2. > # The bug was in the custom Partitioner. > h3. Proposal: > # Producer code should check the partitionId before sending requests to > brokers. > # If there is a negative partition id, just throw an IllegalStateException.
> # Such a quick check can save lots of time for people debugging their > producer code. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-7572) Producer should not send requests with negative partition id
Yaodong Yang created KAFKA-7572: --- Summary: Producer should not send requests with negative partition id Key: KAFKA-7572 URL: https://issues.apache.org/jira/browse/KAFKA-7572 Project: Kafka Issue Type: Bug Components: clients Affects Versions: 1.0.1 Reporter: Yaodong Yang h3. Issue: In one Kafka producer log from our users, we found the following weird entry: timestamp="2018-10-09T17:37:41,237-0700",level="ERROR", Message="Write to Kafka failed with: ",exception="java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for topicName--2: 30042 ms has passed since batch creation plus linger time at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.valueOrError(FutureRecordMetadata.java:94) at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:64) at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:29) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for topicName--2: 30042 ms has passed since batch creation plus linger time" After a few hours of debugging, we finally understood the root cause of this issue: # The producer used a buggy custom Partitioner, which sometimes generates negative partition ids for new records. # The corresponding produce requests were rejected by brokers, because it's illegal to have a partition with a negative id. # The client kept refreshing its local cluster metadata, but could not send produce requests successfully. # From the above log, we found a suspicious string "topicName--2": # According to the source code, the format of this string in the log is TopicName+"-"+PartitionId. # It's not easy to notice that there were two consecutive dashes in the above log. # Eventually, we found that the second dash was a negative sign. Therefore, the partition id is -2, rather than 2. # The bug was in the custom Partitioner. h3. Proposal: # Producer code should check the partitionId before sending requests to brokers. # If there is a negative partition id, just throw an IllegalStateException. # Such a quick check can save lots of time for people debugging their producer code. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
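For illustration, a minimal sketch of the proposed fail-fast check (hypothetical placement and names; the actual change is in the pull request referenced above):
{code}
// Illustrative sketch (not the code from the actual patch): validate the
// partition id in the producer instead of letting the broker reject the
// request with an opaque timeout.
private static int checkedPartition(Partitioner partitioner, String topic, Object key,
                                    byte[] keyBytes, Object value, byte[] valueBytes,
                                    Cluster cluster) {
    int partition = partitioner.partition(topic, key, keyBytes, value, valueBytes, cluster);
    if (partition < 0) {
        throw new IllegalStateException("Invalid partition " + partition
                + " returned by partitioner " + partitioner.getClass().getName()
                + " for topic " + topic);
    }
    return partition;
}
{code}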
[jira] [Commented] (KAFKA-7481) Consider options for safer upgrade of offset commit value schema
[ https://issues.apache.org/jira/browse/KAFKA-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669203#comment-16669203 ] ASF GitHub Bot commented on KAFKA-7481: --- hachikuji opened a new pull request #5857: KAFKA-7481; Add upgrade/downgrade notes for 2.1.x URL: https://github.com/apache/kafka/pull/5857 We seemed to be missing the usual rolling upgrade instructions so I've added them and emphasized the impact for downgrades after bumping the inter-broker protocol version. ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Consider options for safer upgrade of offset commit value schema > > > Key: KAFKA-7481 > URL: https://issues.apache.org/jira/browse/KAFKA-7481 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Blocker > Fix For: 2.1.0 > > > KIP-211 and KIP-320 add new versions of the offset commit value schema. The > use of the new schema version is controlled by the > `inter.broker.protocol.version` configuration. Once the new inter-broker > version is in use, it is not possible to downgrade since the older brokers > will not be able to parse the new schema. > The options at the moment are the following: > 1. Do nothing. Users can try the new version and keep > `inter.broker.protocol.version` locked to the old release. Downgrade will > still be possible, but users will not be able to test new capabilities which > depend on inter-broker protocol changes. > 2. Instead of using `inter.broker.protocol.version`, we could use > `message.format.version`. This would basically extend the use of this config > to apply to all persistent formats. The advantage is that it allows users to > upgrade the broker and begin using the new inter-broker protocol while still > allowing downgrade. But features which depend on the persistent format could > not be tested. > Any other options? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7481) Consider options for safer upgrade of offset commit value schema
[ https://issues.apache.org/jira/browse/KAFKA-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669120#comment-16669120 ] Jason Gustafson commented on KAFKA-7481: I am also in favor of option 1 for the time being. While writing the KIP above, I felt the actual benefit to having a separate configuration was marginal and probably not worth the upgrade complexity that it adds. Note that I filed KAFKA-7570 to explore better options for managing the compatibility of the internal topic schemas. If we could figure that out, then we may not ultimately need the separate configuration. This discussion has also made it clear that we need a consistent and tested approach for downgrading Kafka. I also filed KAFKA-7571 to add system tests, which are lacking at the moment. Then we just need to agree on the long-term approach and document it. For now, I will go ahead and submit a PR to add some additional upgrade notes mentioning the incompatible changes to the offsets schema and the impact for downgrades. > Consider options for safer upgrade of offset commit value schema > > > Key: KAFKA-7481 > URL: https://issues.apache.org/jira/browse/KAFKA-7481 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Blocker > Fix For: 2.1.0 > > > KIP-211 and KIP-320 add new versions of the offset commit value schema. The > use of the new schema version is controlled by the > `inter.broker.protocol.version` configuration. Once the new inter-broker > version is in use, it is not possible to downgrade since the older brokers > will not be able to parse the new schema. > The options at the moment are the following: > 1. Do nothing. Users can try the new version and keep > `inter.broker.protocol.version` locked to the old release. Downgrade will > still be possible, but users will not be able to test new capabilities which > depend on inter-broker protocol changes. > 2. Instead of using `inter.broker.protocol.version`, we could use > `message.format.version`. This would basically extend the use of this config > to apply to all persistent formats. The advantage is that it allows users to > upgrade the broker and begin using the new inter-broker protocol while still > allowing downgrade. But features which depend on the persistent format could > not be tested. > Any other options? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-7571) Add system tests for downgrading Kafka
Jason Gustafson created KAFKA-7571: -- Summary: Add system tests for downgrading Kafka Key: KAFKA-7571 URL: https://issues.apache.org/jira/browse/KAFKA-7571 Project: Kafka Issue Type: Improvement Components: system tests Reporter: Jason Gustafson We have system tests which verify client behavior when upgrading Kafka. We should add similar tests for supported downgrades. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-7570) Make internal offsets/transaction schemas forward compatible
Jason Gustafson created KAFKA-7570: -- Summary: Make internal offsets/transaction schemas forward compatible Key: KAFKA-7570 URL: https://issues.apache.org/jira/browse/KAFKA-7570 Project: Kafka Issue Type: Improvement Reporter: Jason Gustafson Currently changes to the data stored in the internal topics (__consumer_offsets and __transaction_state) are not forward compatible. This means that once users have upgraded to a Kafka version which includes a bumped schema, then it is no longer possible to downgrade. The changes to these schemas tend to be incremental, so we should consider options that at least allow new fields to be added without breaking downgrades. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
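One common pattern for this (illustrative only, not Kafka's actual schema code) is to let old readers parse the fields they know and ignore any trailing bytes appended by a newer schema version:
{code}
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative sketch of forward-compatible decoding; the field layout is invented.
final class OffsetValueSketch {
    final long offset;
    final String metadata;

    OffsetValueSketch(long offset, String metadata) {
        this.offset = offset;
        this.metadata = metadata;
    }

    static OffsetValueSketch parse(ByteBuffer buf) {
        short version = buf.getShort();   // schema version written by the broker
        long offset = buf.getLong();      // field known to all versions
        short len = buf.getShort();
        byte[] meta = new byte[len];
        buf.get(meta);
        // Forward compatibility: any bytes remaining past the fields this
        // reader understands were added by a newer version and are skipped,
        // so a downgraded broker can still read the record.
        return new OffsetValueSketch(offset, new String(meta, StandardCharsets.UTF_8));
    }
}
{code}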
[jira] [Commented] (KAFKA-7192) State-store can desynchronise with changelog
[ https://issues.apache.org/jira/browse/KAFKA-7192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669064#comment-16669064 ] Matthias J. Sax commented on KAFKA-7192: I hope so. Did not have time yet to port the fix into the 1.1 branch. > State-store can desynchronise with changelog > > > Key: KAFKA-7192 > URL: https://issues.apache.org/jira/browse/KAFKA-7192 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.3, 1.0.2, 1.1.1, 2.0.0 >Reporter: Jon Bates >Assignee: Guozhang Wang >Priority: Critical > Labels: bugs > Fix For: 0.11.0.4, 1.0.3, 2.0.1, 2.1.0 > > > n.b. this bug has been verified with exactly-once processing enabled > Consider the following scenario: > * A record, N is read into a Kafka topology > * the state store is updated > * the topology crashes > h3. *Expected behaviour:* > # Node is restarted > # Offset was never updated, so record N is reprocessed > # State-store is reset to position N-1 > # Record is reprocessed > h3. *Actual Behaviour* > # Node is restarted > # Record N is reprocessed (good) > # The state store has the state from the previous processing > I'd consider this a corruption of the state-store, hence the critical > Priority, although High may be more appropriate. > I wrote a proof-of-concept here, which demonstrates the problem on Linux: > [https://github.com/spadger/kafka-streams-sad-state-store] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7192) State-store can desynchronise with changelog
[ https://issues.apache.org/jira/browse/KAFKA-7192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16668859#comment-16668859 ] Tobias Johansson commented on KAFKA-7192: - Will this be fixed in Kafka v1.1.2? > State-store can desynchronise with changelog > > > Key: KAFKA-7192 > URL: https://issues.apache.org/jira/browse/KAFKA-7192 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.3, 1.0.2, 1.1.1, 2.0.0 >Reporter: Jon Bates >Assignee: Guozhang Wang >Priority: Critical > Labels: bugs > Fix For: 0.11.0.4, 1.0.3, 2.0.1, 2.1.0 > > > n.b. this bug has been verified with exactly-once processing enabled > Consider the following scenario: > * A record, N is read into a Kafka topology > * the state store is updated > * the topology crashes > h3. *Expected behaviour:* > # Node is restarted > # Offset was never updated, so record N is reprocessed > # State-store is reset to position N-1 > # Record is reprocessed > h3. *Actual Behaviour* > # Node is restarted > # Record N is reprocessed (good) > # The state store has the state from the previous processing > I'd consider this a corruption of the state-store, hence the critical > Priority, although High may be more appropriate. > I wrote a proof-of-concept here, which demonstrates the problem on Linux: > [https://github.com/spadger/kafka-streams-sad-state-store] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (KAFKA-7165) Error while creating ephemeral at /brokers/ids/BROKER_ID
[ https://issues.apache.org/jira/browse/KAFKA-7165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16668672#comment-16668672 ] Jonathan Santilli edited comment on KAFKA-7165 at 10/30/18 12:59 PM: - Thanks for your reply [~junrao] , that was the only time we had that issue of the _org.apache.zookeeper.KeeperException$SessionExpiredException_ so far. Now we are in version 2.0 of Kafka and from time to time suffering the *NODEEXISTS* issue. Maybe the errors were related but difficult to ensure that, hopefully with the fix, we can get rid of the *NODEEXISTS* error. Cheers! was (Author: pachilo): Thanks for your reply [~junrao] , that was the only time we had that issue of the _org.apache.zookeeper.KeeperException$SessionExpiredException_ so far. Now we are in version 2.0 of Kafka and from time to time suffering the *NODEEXISTS* issue. Maybe the errors were related but difficult to ensure that, hopefully with the fix, we can get rid of the *NODEEXISTS* error*.* Cheers! > Error while creating ephemeral at /brokers/ids/BROKER_ID > > > Key: KAFKA-7165 > URL: https://issues.apache.org/jira/browse/KAFKA-7165 > Project: Kafka > Issue Type: Bug > Components: core, zkclient >Affects Versions: 1.1.0 >Reporter: Jonathan Santilli >Assignee: Jonathan Santilli >Priority: Major > > Kafka version: 1.1.0 > Zookeeper version: 3.4.12 > 4 Kafka Brokers > 4 Zookeeper servers > > In one of the 4 brokers of the cluster, we detect the following error: > [2018-07-14 04:38:23,784] INFO Unable to read additional data from server > sessionid 0x3000c2420cb458d, likely server has closed socket, closing socket > connection and attempting reconnect (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:24,509] INFO Opening socket connection to server > *ZOOKEEPER_SERVER_1:PORT*. Will not attempt to authenticate using SASL > (unknown error) (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:24,510] INFO Socket connection established to > *ZOOKEEPER_SERVER_1:PORT*, initiating session > (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:24,513] INFO Unable to read additional data from server > sessionid 0x3000c2420cb458d, likely server has closed socket, closing socket > connection and attempting reconnect (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:25,287] INFO Opening socket connection to server > *ZOOKEEPER_SERVER_2:PORT*. Will not attempt to authenticate using SASL > (unknown error) (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:25,287] INFO Socket connection established to > *ZOOKEEPER_SERVER_2:PORT*, initiating session > (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:25,954] INFO [Partition TOPIC_NAME-PARTITION-# broker=1|#* > broker=1] Shrinking ISR from 1,3,4,2 to 1,4,2 (kafka.cluster.Partition) > [2018-07-14 04:38:26,444] WARN Unable to reconnect to ZooKeeper service, > session 0x3000c2420cb458d has expired (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:26,444] INFO Unable to reconnect to ZooKeeper service, > session 0x3000c2420cb458d has expired, closing socket connection > (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:26,445] INFO EventThread shut down for session: > 0x3000c2420cb458d (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:26,446] INFO [ZooKeeperClient] Session expired. > (kafka.zookeeper.ZooKeeperClient) > [2018-07-14 04:38:26,459] INFO [ZooKeeperClient] Initializing a new session > to > *ZOOKEEPER_SERVER_1:PORT*,*ZOOKEEPER_SERVER_2:PORT*,*ZOOKEEPER_SERVER_3:PORT*,*ZOOKEEPER_SERVER_4:PORT*. 
> (kafka.zookeeper.ZooKeeperClient) > [2018-07-14 04:38:26,459] INFO Initiating client connection, > connectString=*ZOOKEEPER_SERVER_1:PORT*,*ZOOKEEPER_SERVER_2:PORT*,*ZOOKEEPER_SERVER_3:PORT*,*ZOOKEEPER_SERVER_4:PORT* > sessionTimeout=6000 > watcher=kafka.zookeeper.ZooKeeperClient$ZooKeeperClientWatcher$@44821a96 > (org.apache.zookeeper.ZooKeeper) > [2018-07-14 04:38:26,465] INFO Opening socket connection to server > *ZOOKEEPER_SERVER_1:PORT*. Will not attempt to authenticate using SASL > (unknown error) (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:26,477] INFO Socket connection established to > *ZOOKEEPER_SERVER_1:PORT*, initiating session > (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:26,484] INFO Session establishment complete on server > *ZOOKEEPER_SERVER_1:PORT*, sessionid = 0x4005b59eb6a, negotiated timeout > = 6000 (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:26,496] *INFO Creating /brokers/ids/1* (is it secure? > false) (kafka.zk.KafkaZkClient) > [2018-07-14 04:38:26,500] INFO Processing notification(s) to /config/changes > (kafka.common.ZkNodeChangeNotificationListener) > *[2018-07-14 04:38:26,547] ERROR Error while
[jira] [Commented] (KAFKA-7165) Error while creating ephemeral at /brokers/ids/BROKER_ID
[ https://issues.apache.org/jira/browse/KAFKA-7165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16668672#comment-16668672 ] Jonathan Santilli commented on KAFKA-7165: -- Thanks for your reply [~junrao] , that was the only time we had that issue of the _org.apache.zookeeper.KeeperException$SessionExpiredException_ so far. Now we are in version 2.0 of Kafka and from time to time suffering the *NODEEXISTS* issue. Maybe the errors were related but difficult to ensure that, hopefully with the fix, we can get rid of the *NODEEXISTS* error*.* Cheers! > Error while creating ephemeral at /brokers/ids/BROKER_ID > > > Key: KAFKA-7165 > URL: https://issues.apache.org/jira/browse/KAFKA-7165 > Project: Kafka > Issue Type: Bug > Components: core, zkclient >Affects Versions: 1.1.0 >Reporter: Jonathan Santilli >Assignee: Jonathan Santilli >Priority: Major > > Kafka version: 1.1.0 > Zookeeper version: 3.4.12 > 4 Kafka Brokers > 4 Zookeeper servers > > In one of the 4 brokers of the cluster, we detect the following error: > [2018-07-14 04:38:23,784] INFO Unable to read additional data from server > sessionid 0x3000c2420cb458d, likely server has closed socket, closing socket > connection and attempting reconnect (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:24,509] INFO Opening socket connection to server > *ZOOKEEPER_SERVER_1:PORT*. Will not attempt to authenticate using SASL > (unknown error) (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:24,510] INFO Socket connection established to > *ZOOKEEPER_SERVER_1:PORT*, initiating session > (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:24,513] INFO Unable to read additional data from server > sessionid 0x3000c2420cb458d, likely server has closed socket, closing socket > connection and attempting reconnect (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:25,287] INFO Opening socket connection to server > *ZOOKEEPER_SERVER_2:PORT*. Will not attempt to authenticate using SASL > (unknown error) (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:25,287] INFO Socket connection established to > *ZOOKEEPER_SERVER_2:PORT*, initiating session > (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:25,954] INFO [Partition TOPIC_NAME-PARTITION-# broker=1|#* > broker=1] Shrinking ISR from 1,3,4,2 to 1,4,2 (kafka.cluster.Partition) > [2018-07-14 04:38:26,444] WARN Unable to reconnect to ZooKeeper service, > session 0x3000c2420cb458d has expired (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:26,444] INFO Unable to reconnect to ZooKeeper service, > session 0x3000c2420cb458d has expired, closing socket connection > (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:26,445] INFO EventThread shut down for session: > 0x3000c2420cb458d (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:26,446] INFO [ZooKeeperClient] Session expired. > (kafka.zookeeper.ZooKeeperClient) > [2018-07-14 04:38:26,459] INFO [ZooKeeperClient] Initializing a new session > to > *ZOOKEEPER_SERVER_1:PORT*,*ZOOKEEPER_SERVER_2:PORT*,*ZOOKEEPER_SERVER_3:PORT*,*ZOOKEEPER_SERVER_4:PORT*. > (kafka.zookeeper.ZooKeeperClient) > [2018-07-14 04:38:26,459] INFO Initiating client connection, > connectString=*ZOOKEEPER_SERVER_1:PORT*,*ZOOKEEPER_SERVER_2:PORT*,*ZOOKEEPER_SERVER_3:PORT*,*ZOOKEEPER_SERVER_4:PORT* > sessionTimeout=6000 > watcher=kafka.zookeeper.ZooKeeperClient$ZooKeeperClientWatcher$@44821a96 > (org.apache.zookeeper.ZooKeeper) > [2018-07-14 04:38:26,465] INFO Opening socket connection to server > *ZOOKEEPER_SERVER_1:PORT*. 
Will not attempt to authenticate using SASL > (unknown error) (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:26,477] INFO Socket connection established to > *ZOOKEEPER_SERVER_1:PORT*, initiating session > (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:26,484] INFO Session establishment complete on server > *ZOOKEEPER_SERVER_1:PORT*, sessionid = 0x4005b59eb6a, negotiated timeout > = 6000 (org.apache.zookeeper.ClientCnxn) > [2018-07-14 04:38:26,496] *INFO Creating /brokers/ids/1* (is it secure? > false) (kafka.zk.KafkaZkClient) > [2018-07-14 04:38:26,500] INFO Processing notification(s) to /config/changes > (kafka.common.ZkNodeChangeNotificationListener) > *[2018-07-14 04:38:26,547] ERROR Error while creating ephemeral at > /brokers/ids/1, node already exists and owner '216186131422332301' does not > match current session '288330817911521280' > (kafka.zk.KafkaZkClient$CheckedEphemeral)* > [2018-07-14 04:38:26,547] *INFO Result of znode creation at /brokers/ids/1 > is: NODEEXISTS* (kafka.zk.KafkaZkClient) > [2018-07-14 04:38:26,559] ERROR Uncaught exception in scheduled task > 'isr-expiration' (kafka.utils.KafkaScheduler) >
[jira] [Created] (KAFKA-7569) Kafka doesn't appear to clean up dangling partitions
Dao Quang Minh created KAFKA-7569: - Summary: Kafka doesn't appear to clean up dangling partitions Key: KAFKA-7569 URL: https://issues.apache.org/jira/browse/KAFKA-7569 Project: Kafka Issue Type: Bug Affects Versions: 1.0.0 Reporter: Dao Quang Minh In our current cluster running Kafka 1.0.0, we recently observed that Kafka doesn't clean up dangling partitions (i.e. partition data is on disk, but the partition is not assigned to the current broker anymore). As an example of the dangling partition data, we have: {code} total 26G -rw-r--r-- 1 kafka kafka 1.9G Aug 6 16:19 352433304663.log -rw-r--r-- 1 kafka kafka 1.9G Aug 6 16:16 352414164340.log -rw-r--r-- 1 kafka kafka 1.9G Aug 6 16:24 352466972892.log -rw-r--r-- 1 kafka kafka 1.9G Aug 6 16:23 352457368236.log -rw-r--r-- 1 kafka kafka 1.9G Aug 6 16:17 352423709566.log -rw-r--r-- 1 kafka kafka 1.9G Aug 6 16:21 352447702369.log -rw-r--r-- 1 kafka kafka 1.9G Aug 6 16:20 352442921890.log -rw-r--r-- 1 kafka kafka 1.9G Aug 6 16:22 352452551548.log -rw-r--r-- 1 kafka kafka 1.9G Aug 6 16:17 352418945305.log -rw-r--r-- 1 kafka kafka 1.9G Aug 6 16:18 352428477361.log -rw-r--r-- 1 kafka kafka 1.9G Aug 6 16:15 352409416538.log -rw-r--r-- 1 kafka kafka 1.9G Aug 6 16:24 352462192103.log -rw-r--r-- 1 kafka kafka 1.9G Aug 6 16:20 352438136012.log -rw-r--r-- 1 kafka kafka 1.8G Aug 6 17:43 352471757311.log -rw-r--r-- 1 kafka kafka 10M Oct 16 21:44 352471757311.index -rw-r--r-- 1 kafka kafka 10M Oct 16 21:44 352471757311.timeindex drwxr-xr-x 2 kafka kafka 4.0K Oct 8 15:27 . drwxr-xr-x 49 kafka kafka 4.0K Oct 30 11:21 .. -rw-r--r-- 1 kafka kafka 2.3K Oct 16 21:44 352414164340.timeindex -rw-r--r-- 1 kafka kafka 2.3K Oct 16 21:44 352423709566.timeindex -rw-r--r-- 1 kafka kafka 2.3K Oct 16 21:44 352433304663.timeindex -rw-r--r-- 1 kafka kafka 2.3K Oct 16 21:44 352447702369.timeindex -rw-r--r-- 1 kafka kafka 2.3K Oct 16 21:44 352457368236.timeindex -rw-r--r-- 1 kafka kafka 2.3K Oct 16 21:44 352466972892.timeindex -rw-r--r-- 1 kafka kafka 2.3K Oct 16 21:44 352409416538.timeindex -rw-r--r-- 1 kafka kafka 2.3K Oct 16 21:44 352418945305.timeindex -rw-r--r-- 1 kafka kafka 2.3K Oct 16 21:44 352428477361.timeindex -rw-r--r-- 1 kafka kafka 2.3K Oct 16 21:44 352438136012.timeindex -rw-r--r-- 1 kafka kafka 2.3K Oct 16 21:44 352442921890.timeindex -rw-r--r-- 1 kafka kafka 2.3K Oct 16 21:44 352452551548.timeindex -rw-r--r-- 1 kafka kafka 2.3K Oct 16 21:44 352462192103.timeindex -rw-r--r-- 1 kafka kafka 1.5K Oct 16 21:44 352414164340.index -rw-r--r-- 1 kafka kafka 1.5K Oct 16 21:44 352423709566.index -rw-r--r-- 1 kafka kafka 1.5K Oct 16 21:44 352433304663.index -rw-r--r-- 1 kafka kafka 1.5K Oct 16 21:44 352447702369.index -rw-r--r-- 1 kafka kafka 1.5K Oct 16 21:44 352457368236.index -rw-r--r-- 1 kafka kafka 1.5K Oct 16 21:44 352466972892.index -rw-r--r-- 1 kafka kafka 1.5K Oct 16 21:44 352409416538.index -rw-r--r-- 1 kafka kafka 1.5K Oct 16 21:44 352418945305.index -rw-r--r-- 1 kafka kafka 1.5K Oct 16 21:44 352428477361.index -rw-r--r-- 1 kafka kafka 1.5K Oct 16 21:44 352438136012.index -rw-r--r-- 1 kafka kafka 1.5K Oct 16 21:44 352442921890.index -rw-r--r-- 1 kafka kafka 1.5K Oct 16 21:44 352452551548.index -rw-r--r-- 1 kafka kafka 1.5K Oct 16 21:44 352462192103.index -rw-r--r-- 1 kafka kafka 20 Aug 6 16:23 leader-epoch-checkpoint -rw-r--r-- 1 kafka kafka 10 Aug 6 16:24 352466972892.snapshot -rw-r--r-- 1 kafka kafka 10 Aug 6 16:24 352471757311.snapshot -rw-r--r-- 1 kafka kafka 10 Oct 8 15:27 352476186724.snapshot {code} I'm unsure how we ended up in this situation, as partition data should
be marked as removed and eventually removed when it's not assigned to the broker anymore. But in this edge case, should Kafka detect that automatically when it loads the partition and re-mark it as to-be-deleted again? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-2607) Review `Time` interface and its usage
[ https://issues.apache.org/jira/browse/KAFKA-2607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16668325#comment-16668325 ] Aleksei Kogan commented on KAFKA-2607: -- [~ijuma] Hello! I would like to help with this task, as the existing pull request is completely outdated. Could you assign it to me? > Review `Time` interface and its usage > - > > Key: KAFKA-2607 > URL: https://issues.apache.org/jira/browse/KAFKA-2607 > Project: Kafka > Issue Type: Improvement >Affects Versions: 0.11.0.0, 1.0.0 >Reporter: Ismael Juma >Priority: Major > Labels: newbie > > Two of `Time` interface's methods are `milliseconds` and `nanoseconds` which > are implemented in `SystemTime` as follows: > {code} > @Override > public long milliseconds() { > return System.currentTimeMillis(); > } > @Override > public long nanoseconds() { > return System.nanoTime(); > } > {code} > The issue with this interface is that it makes it seem that the difference is > about the unit (`ms` versus `ns`) whereas it's much more than that: > https://blogs.oracle.com/dholmes/entry/inside_the_hotspot_vm_clocks > We should probably change the names of the methods and review our usage to > see if we're using the right one in the various places. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
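The distinction in question (a minimal illustration, not Kafka code): currentTimeMillis() reads the wall clock, which can jump when the system clock is adjusted, while nanoTime() is monotonic but only meaningful for measuring elapsed intervals:
{code}
import java.util.concurrent.TimeUnit;

public class ElapsedTimeExample {
    public static void main(String[] args) throws InterruptedException {
        long wallStart = System.currentTimeMillis(); // wall clock: can be skewed by NTP/DST
        long monoStart = System.nanoTime();          // monotonic: absolute value is meaningless

        Thread.sleep(100);

        System.out.println("wall-clock delta (subject to clock adjustments): "
                + (System.currentTimeMillis() - wallStart) + " ms");
        System.out.println("monotonic delta (safe for durations): "
                + TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - monoStart) + " ms");
    }
}
{code}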
[jira] [Commented] (KAFKA-7568) Return leader epoch in ListOffsets responses
[ https://issues.apache.org/jira/browse/KAFKA-7568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16668262#comment-16668262 ] ASF GitHub Bot commented on KAFKA-7568: --- hachikuji opened a new pull request #5855: KAFKA-7568; Return leader epoch in ListOffsets response URL: https://github.com/apache/kafka/pull/5855 As part of KIP-320, the ListOffsets API should return the leader epoch of any fetched offset. We either get this epoch from the log itself for a timestamp query or from the epoch cache if we are searching the earliest or latest offset in the log. When handling queries for the latest offset, we have elected to choose the current leader epoch, which is consistent with other handling (e.g. OffsetsForTimes). ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Return leader epoch in ListOffsets responses > > > Key: KAFKA-7568 > URL: https://issues.apache.org/jira/browse/KAFKA-7568 > Project: Kafka > Issue Type: Improvement >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Major > > This is part of KIP-320. The changes to the API have already been made, but > currently we return unknown epoch. We need to update the logic to search for > the epoch corresponding to a fetched offset in the leader epoch cache. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
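On the client side the returned epoch surfaces through the consumer's offset-lookup APIs; a sketch, assuming a 2.1+ client where OffsetAndTimestamp exposes leaderEpoch() (the consumer variable, topic name, and timestamp below are placeholders):
{code}
// Look up the offset for a timestamp and read the leader epoch that the
// ListOffsets response now carries (KIP-320).
Map<TopicPartition, Long> query =
        Collections.singletonMap(new TopicPartition("my-topic", 0), 1540000000000L);
Map<TopicPartition, OffsetAndTimestamp> result = consumer.offsetsForTimes(query);
for (Map.Entry<TopicPartition, OffsetAndTimestamp> e : result.entrySet()) {
    OffsetAndTimestamp ot = e.getValue();
    if (ot != null) {
        // leaderEpoch() is an Optional<Integer>; empty if the broker did not
        // return an epoch for this offset.
        System.out.println(e.getKey() + " offset=" + ot.offset()
                + " leaderEpoch=" + ot.leaderEpoch());
    }
}
{code}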