[jira] [Commented] (KAFKA-7794) kafka.tools.GetOffsetShell does not return the offset in some cases
[ https://issues.apache.org/jira/browse/KAFKA-7794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767814#comment-16767814 ]

huxihx commented on KAFKA-7794:
---

There are some inconsistencies here where GetOffsetShell wants to seek the largest offset before the given timestamp, but ListOffsetRequest retrieves the smallest offset after the timestamp (See comments of method `fetchOffsetByTimestamp` in Log.scala). That's why you'll get all the log start offsets if specifying a very old timestamp, but you get nothing when a future timestamp is given.

A naive solution is to correct the description of `--time` for GetOffsetShell :)

> kafka.tools.GetOffsetShell does not return the offset in some cases
> ---
>
>                 Key: KAFKA-7794
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7794
>             Project: Kafka
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 0.10.2.0, 0.10.2.1, 0.10.2.2
>            Reporter: Daniele Ascione
>            Assignee: Kartik
>            Priority: Critical
>              Labels: Kafka, ShellCommands, kafka-0.10, offset, shell, shell-script, shellscript, tools, usability
>         Attachments: image-2019-02-11-20-51-07-805.png, image-2019-02-11-20-56-13-362.png, image-2019-02-11-20-57-03-579.png, image-2019-02-12-16-19-25-170.png, image-2019-02-12-16-21-13-126.png, image-2019-02-12-16-23-38-399.png, image-2019-02-13-11-43-24-128.png, image-2019-02-13-11-43-28-873.png, image-2019-02-13-11-44-18-736.png, image-2019-02-13-11-45-21-459.png
>
>
> For some input for the timestamps (different from -1 or -2) the GetOffset is not able to retrieve the offset.
> For example, if _x_ is the timestamp in that "not working range", and you execute:
> {code:java}
> bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list $KAFKA_ADDRESS --topic $MY_TOPIC --time x
> {code}
> The output is:
> {code:java}
> MY_TOPIC:8:
> MY_TOPIC:2:
> MY_TOPIC:5:
> MY_TOPIC:4:
> MY_TOPIC:7:
> MY_TOPIC:1:
> MY_TOPIC:9:{code}
> while after the last ":" an integer representing the offset is expected.
>
> Steps to reproduce it:
> # Consume all the messages from the beginning and print the timestamp:
> {code:java}
> bin/kafka-simple-consumer-shell.sh --no-wait-at-logend --broker-list $KAFKA_ADDRESS --topic $MY_TOPIC --property print.timestamp=true > messages{code}
> # Sort the messages by timestamp and get some of the oldest messages:
> {code:java}
> awk -F "CreateTime:" '{ print $2}' messages | sort -n > msg_sorted{code}
> # Take (for example) the timestamp of the 10th oldest message, and see if GetOffsetShell is not able to print the offset:
> {code:java}
> timestamp="$(sed '10q;d' msg_sorted | cut -f1)"
> bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list $KAFKA_ADDRESS --topic $MY_TOPIC --time $timestamp
> # The output should be something like:
> # MY_TOPIC:1:
> # MY_TOPIC:2:
> (repeated for every partition){code}
> # Verify that the message with that timestamp is still in Kafka:
> {code:java}
> bin/kafka-simple-consumer-shell.sh --no-wait-at-logend --broker-list $KAFKA_ADDRESS --topic $MY_TOPIC --property print.timestamp=true | grep "CreateTime:$timestamp" {code}
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
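The lookup behaviour described in the comment above can be imitated on a toy log. The sketch below is not Kafka code: it assumes a hypothetical partition log given as "offset timestamp" pairs with made-up values, and the helper names `log_entries` and `offset_for_timestamp` are invented for illustration. It also assumes "after the timestamp" means "at or after". It returns the smallest offset whose timestamp is at or after the target, so a very old timestamp yields the first offset and a future timestamp yields nothing.

```shell
#!/bin/sh
# Hypothetical partition log: one "offset timestamp" pair per line, in offset order.
# The values are made up for illustration and do not come from the issue.
log_entries() {
  cat <<'EOF'
0 1000
1 1010
2 1020
3 1030
EOF
}

# Print the smallest offset whose timestamp is >= the target ($1).
# Prints nothing when every record is older than the target.
offset_for_timestamp() {
  log_entries | awk -v target="$1" '$2 >= target { print $1; exit }'
}

printf 'old timestamp    -> "%s"\n' "$(offset_for_timestamp 1)"
printf 'exact timestamp  -> "%s"\n' "$(offset_for_timestamp 1020)"
printf 'future timestamp -> "%s"\n' "$(offset_for_timestamp 9999)"
```

Under these assumptions the empty result for the future timestamp corresponds to the `MY_TOPIC:1:` lines in the report, where nothing follows the last colon.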
[jira] [Commented] (KAFKA-7794) kafka.tools.GetOffsetShell does not return the offset in some cases
[ https://issues.apache.org/jira/browse/KAFKA-7794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766934#comment-16766934 ]

Daniele Ascione commented on KAFKA-7794:
---

[~kartikvk1996] I've used Kafka 0.10.2.1
[jira] [Commented] (KAFKA-7794) kafka.tools.GetOffsetShell does not return the offset in some cases
[ https://issues.apache.org/jira/browse/KAFKA-7794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766828#comment-16766828 ]

Kartik commented on KAFKA-7794:
---

[~audhumla] I tried your steps, and for me it works as expected.
 # Created a new topic with 10 partitions and replication factor = 1.
 # Inserted 5000 rows.
!image-2019-02-13-11-43-28-873.png!
 # Consumed all the messages.
!image-2019-02-13-11-44-18-736.png!
 # Passed the 10th timestamp value to GetOffsetShell and got the offset properly.
!image-2019-02-13-11-45-21-459.png!

Can you tell me which version you are testing? It looks like the issue is fixed in the new version.
[jira] [Commented] (KAFKA-7794) kafka.tools.GetOffsetShell does not return the offset in some cases
[ https://issues.apache.org/jira/browse/KAFKA-7794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766139#comment-16766139 ]

Daniele Ascione commented on KAFKA-7794:
---

[~kartikvk1996], I am able to reproduce the behaviour on my machine. I am using Kafka on Red Hat Enterprise Linux Server release 7.3 (Maipo).

You could produce more messages using:
{code:java}
bin/kafka-producer-perf-test.sh --topic demo --throughput 10 --num-records 5 --record-size 5 --producer-props bootstrap.servers=127.0.0.1:9092
{code}
If you then take the timestamp of a message using the procedure I described, you should be able to reproduce it. I ran it on my system just now, then typed:
{code:java}
bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic test --time 1549983817353
{code}
and the output is:

!image-2019-02-12-16-19-25-170.png!

I have messages both before and after that timestamp:

!image-2019-02-12-16-23-38-399.png!
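The timestamp-picking procedure referred to in the comment above can be tried offline on a small file. This is a minimal sketch, assuming the consumer dump has lines of the form `CreateTime:<millis><TAB><payload>`, which is what the `cut -f1` in the reproduction steps suggests. The helper name `nth_oldest_timestamp` and the sample values are hypothetical.

```shell
#!/bin/sh
# Print the n-th oldest CreateTime value found in a consumer dump.
# $1 = file with lines like "CreateTime:<millis><TAB><payload>", $2 = n (1-based).
nth_oldest_timestamp() {
  awk -F 'CreateTime:' 'NF > 1 { print $2 }' "$1" | cut -f1 | sort -n | sed -n "${2}p"
}

# Made-up sample standing in for the "messages" file from the reproduction steps.
sample=$(mktemp)
printf 'CreateTime:1549983817400\tc\nCreateTime:1549983817353\ta\nCreateTime:1549983817390\tb\n' > "$sample"

timestamp=$(nth_oldest_timestamp "$sample" 2)
echo "second-oldest timestamp: $timestamp"
rm -f "$sample"
```

The resulting value is what would then be passed to GetOffsetShell through `--time`.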
[jira] [Commented] (KAFKA-7794) kafka.tools.GetOffsetShell does not return the offset in some cases
[ https://issues.apache.org/jira/browse/KAFKA-7794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766100#comment-16766100 ]

Kartik commented on KAFKA-7794:
---

[~ijuma] Can you help me here? When the given timestamp is greater than the latest committed record's timestamp, should we return the latest offset or just throw an error message? Once that is decided, I can work on this issue accordingly.
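The "error message" option raised in the comment above can be approximated outside the tool by inspecting its output. The sketch below is an assumption about how a wrapper script might do that, not how GetOffsetShell behaves or was fixed. The function name `check_offsets` is hypothetical. It reads `topic:partition:offset` lines and fails when any offset field is empty.

```shell
#!/bin/sh
# Read "topic:partition:offset" lines on stdin and report partitions whose
# offset field is empty. Exit status is 1 if any partition had no offset.
check_offsets() {
  awk -F ':' '
    BEGIN { missing = 0 }
    NF >= 3 && $3 == "" { print "no offset for " $1 " partition " $2; missing = 1 }
    END { exit missing }
  '
}

# Made-up sample output; with a real cluster the input would instead come from
#   bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list "$KAFKA_ADDRESS" \
#     --topic "$MY_TOPIC" --time "$timestamp" | check_offsets
printf 'MY_TOPIC:0:42\nMY_TOPIC:1:\n' | check_offsets \
  || echo "some partitions returned no offset"
```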
[jira] [Commented] (KAFKA-7794) kafka.tools.GetOffsetShell does not return the offset in some cases
[ https://issues.apache.org/jira/browse/KAFKA-7794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765684#comment-16765684 ]

Kartik commented on KAFKA-7794:
---

Hi [~huxi_2b], tagging you because you might know this.

Ideally, when the provided timestamp is greater than the latest committed record's timestamp, is it right to return the latest offset, or would you prefer that an error message be thrown? Based on your comment, I can work on this.
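The fallback raised in the comment above, returning the latest offset when the timestamp is newer than every record, could be emulated by merging two listings. This is a sketch under assumptions: it presumes a second query made with `--time -1` (the latest offsets) is available, the function name `fill_missing_offsets` is hypothetical, and the sample values are made up. It is not the behaviour of GetOffsetShell itself.

```shell
#!/bin/sh
# Merge two GetOffsetShell-style listings ("topic:partition:offset" per line).
# $1 = output of the timestamp query, $2 = output of a "--time -1" query.
# Wherever the timestamp query left the offset empty, substitute the latest offset.
fill_missing_offsets() {
  awk -F ':' -v OFS=':' '
    NR == FNR { latest[$1 FS $2] = $3; next }   # first file read: latest offsets
    $3 == "" && (($1 FS $2) in latest) { $3 = latest[$1 FS $2] }
    { print }
  ' "$2" "$1"
}

# Made-up listings standing in for the two queries.
by_time=$(mktemp); newest=$(mktemp)
printf 'MY_TOPIC:0:42\nMY_TOPIC:1:\n' > "$by_time"
printf 'MY_TOPIC:0:50\nMY_TOPIC:1:7\n' > "$newest"
fill_missing_offsets "$by_time" "$newest"
rm -f "$by_time" "$newest"
```

Partitions that already have an offset are passed through unchanged, so only the empty entries are affected.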
[jira] [Commented] (KAFKA-7794) kafka.tools.GetOffsetShell does not return the offset in some cases
[ https://issues.apache.org/jira/browse/KAFKA-7794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765075#comment-16765075 ]

Kartik commented on KAFKA-7794:
---

[~audhumla] I tried your steps and was able to get the offset properly. The offset is not shown if the timestamp you give is greater than the timestamp of the most recently added record.

!image-2019-02-11-20-51-07-805.png!

If I provide a timestamp greater than 1549897929598, such as '1549897929599', I don't get the offset.

!image-2019-02-11-20-57-03-579.png!

When I provide a proper timestamp, I get the proper offset.

!image-2019-02-11-20-56-13-362.png!

Let me know if I am wrong.